JP5684342B2

JP5684342B2 - Method and apparatus for processing digital video data

Info

Publication number: JP5684342B2
Application number: JP2013161889A
Authority: JP
Inventors: シタラマン・ガナパシー・サブラマニア; ファン・シ; ペイソン・チェン; セイフラー・ハリト・オグズ; スコット・ティー．・スワゼイ; ビノド・カウシック
Original assignee: Qualcomm Inc
Current assignee: Qualcomm Inc
Priority date: 2013-08-02
Filing date: 2013-08-02
Publication date: 2015-03-11
Anticipated expiration: 2027-05-04
Also published as: JP2014003648A

Description

This disclosure relates to video coding and, more particularly, to estimating coding costs for coding video sequences.

Digital video capabilities can be incorporated into a wide range of devices, including digital televisions, digital direct broadcast systems, wireless communication devices, personal digital assistants (PDAs), laptop computers, desktop computers, video game consoles, digital cameras, digital recording devices, cellular telephones, satellite radio telephones, and the like. Digital video devices can provide significant improvements over conventional analog video systems in processing and transmitting video sequences.

Different video coding standards have been established for coding digital video sequences. The Moving Picture Experts Group (MPEG), for example, has developed several standards, including MPEG-1, MPEG-2 and MPEG-4. Other examples include the International Telecommunication Union (ITU)-T H.263 standard, and the ITU-T H.264 standard and its counterpart, ISO/IEC MPEG-4, Part 10, i.e., Advanced Video Coding (AVC). These video coding standards support improved transmission efficiency of video sequences by coding data in a compressed form.

Many current techniques make use of block-based coding. In block-based coding, frames of a multimedia sequence are divided into discrete blocks of pixels, and the blocks of pixels are coded based on differences with other blocks, which may be located within the same frame or in a different frame. Some blocks of pixels, often referred to as “macroblocks,” comprise a grouping of sub-blocks of pixels. As an example, a 16×16 macroblock may comprise four 8×8 sub-blocks. The sub-blocks may be coded separately. For example, the H.264 standard permits coding of blocks with a variety of different sizes, e.g., 16×16, 16×8, 8×16, 8×8, 4×4, 8×4, and 4×8. Further, by extension, sub-blocks of any size may be included within a macroblock, e.g., 2×16, 16×2, 2×2, 4×16, and 8×2.

In certain aspects of this disclosure, a method for processing digital video data comprises identifying one or more transform coefficients for residual data of a block of pixels that will remain non-zero when quantized, estimating a number of bits associated with coding of the residual data based on at least the identified transform coefficients, and estimating a coding cost for coding the block of pixels based on at least the estimated number of bits associated with coding the residual data.

In certain aspects, an apparatus for processing digital video data comprises a transform module that generates transform coefficients for residual data of a block of pixels, a bit estimation module that identifies one or more of the transform coefficients that will remain non-zero when quantized and estimates a number of bits associated with coding of the residual data based on at least the identified transform coefficients, and a control module that estimates a coding cost for coding the block of pixels based on at least the estimated number of bits associated with coding the residual data.

In certain aspects, an apparatus for processing digital video data comprises means for identifying one or more transform coefficients for residual data of a block of pixels that will remain non-zero when quantized, means for estimating a number of bits associated with coding of the residual data based on at least the identified transform coefficients, and means for estimating a coding cost for coding the block of pixels based on at least the estimated number of bits associated with coding the residual data.

In certain aspects, a computer program product for processing digital video data comprises a computer-readable medium having instructions stored thereon. The instructions include code for identifying one or more transform coefficients for residual data of a block of pixels that will remain non-zero when quantized, code for estimating a number of bits associated with coding of the residual data based on at least the identified transform coefficients, and code for estimating a coding cost for coding the block of pixels based on at least the estimated number of bits associated with coding the residual data.
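The three steps recited in the aspects above (identify the coefficients that survive quantization, estimate the residual bits from them, and derive a coding cost) might be sketched as follows. This is only an illustration under stated assumptions: the dead-zone threshold, the linear bit model, and the names `surviving_coefficients`, `estimate_residual_bits`, `estimate_coding_cost`, `bits_per_coeff`, `bits_per_level` and `side_info_bits` are all hypothetical and do not come from this disclosure.

```python
from typing import List, Sequence


def surviving_coefficients(coeffs: Sequence[float], qstep: float,
                           rounding: float = 1.0 / 3.0) -> List[float]:
    """Return the coefficients that would quantize to a non-zero level.

    Assumption: a dead-zone quantizer, level = floor(|c| / qstep + rounding),
    which is non-zero exactly when |c| >= qstep * (1 - rounding). No
    quantization is actually performed; only a threshold test is applied.
    """
    threshold = qstep * (1.0 - rounding)
    return [c for c in coeffs if abs(c) >= threshold]


def estimate_residual_bits(coeffs: Sequence[float], qstep: float,
                           bits_per_coeff: float = 3.0,
                           bits_per_level: float = 1.0) -> float:
    """Estimate residual bits from the surviving coefficients only.

    Assumption: a placeholder linear model in the number of survivors and
    the sum of their approximate levels; the weights are illustrative.
    """
    survivors = surviving_coefficients(coeffs, qstep)
    level_sum = sum(abs(c) / qstep for c in survivors)
    return bits_per_coeff * len(survivors) + bits_per_level * level_sum


def estimate_coding_cost(residual_bits: float,
                         side_info_bits: float = 0.0) -> float:
    """Combine residual bits with hypothetical side-information bits."""
    return residual_bits + side_info_bits
```

For instance, with coefficients `[40.0, -9.0, 3.0, 0.5]` and `qstep = 10.0`, only the first two coefficients pass the threshold, so the bit estimate is driven by those two values alone.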

The details of one or more examples are set forth in the accompanying drawings and the description below. Other features, objects, and advantages will be apparent from the description and drawings, and from the claims.

FIG. 1 is a block diagram illustrating a video coding system that employs the coding cost estimation techniques described herein.
FIG. 2 is a block diagram illustrating an exemplary encoding module in further detail.
FIG. 3 is a block diagram illustrating another exemplary encoding module in further detail.
FIG. 4 is a flow diagram illustrating exemplary operation of an encoding module selecting an encoding mode based on estimated coding costs.
FIG. 5 is a flow diagram illustrating exemplary operation of an encoding module estimating the number of bits associated with coding the residual data of a block without quantizing or encoding the residual data.
FIG. 6 is a flow diagram illustrating exemplary operation of an encoding module estimating the number of bits associated with coding the residual data of a block without encoding the residual data.

This disclosure describes techniques for video coding mode selection using estimated coding costs. To provide high compression efficiency, for example, an encoding device may attempt to select a coding mode for a block of pixels that codes the data of the block with high efficiency. To this end, the encoding device may perform coding mode selection based on coding cost estimates for at least a portion of the possible modes. In accordance with the techniques described herein, the encoding device estimates the coding costs for the different modes without actually coding the blocks. In fact, in some aspects, the encoding device may estimate the coding costs for the modes without quantizing the data of the block for each mode. In this manner, the coding cost estimation techniques of this disclosure reduce the amount of computationally intensive calculation needed to perform effective mode selection.

FIG. 1 is a block diagram illustrating a multimedia coding system 10 that employs the coding cost estimation techniques described herein. Coding system 10 includes an encoding device 12 and a decoding device 14 connected by a transmission channel 16. Encoding device 12 encodes one or more sequences of digital multimedia data and transmits the encoded sequences over transmission channel 16 to decoding device 14 for decoding and, possibly, presentation to a user of decoding device 14. Transmission channel 16 may comprise any wired or wireless medium, or a combination thereof.

Encoding device 12 may form part of a broadcast network component used to broadcast one or more channels of multimedia data. As an example, encoding device 12 may form part of a wireless base station, server, or any infrastructure node that is used to broadcast one or more channels of encoded multimedia data to wireless devices. In this case, encoding device 12 may transmit the encoded data to a plurality of wireless devices, such as decoding device 14. A single decoding device 14, however, is illustrated in FIG. 1 for simplicity. Alternatively, encoding device 12 may comprise a handset that transmits locally captured video for video telephony or other similar applications.

Decoding device 14 may comprise a user device that receives the encoded multimedia data transmitted by encoding device 12 and decodes the multimedia data for presentation to a user. By way of example, decoding device 14 may be implemented as part of a digital television, a wireless communication device, a gaming device, a portable digital assistant (PDA), a laptop computer or desktop computer, a digital music and video device, such as those sold under the trademark “iPod,” a radiotelephone such as a cellular, satellite or terrestrial-based radiotelephone, or another wireless mobile terminal equipped for video and/or audio streaming, video telephony, or both. Decoding device 14 may be associated with a mobile or stationary device. In a broadcast application, encoding device 12 may transmit encoded video and/or audio to multiple decoding devices 14 associated with multiple users.

In some aspects, for two-way communication applications, multimedia coding system 10 may support video telephony or video streaming according to the Session Initiation Protocol (SIP), the International Telecommunication Union Standardization Sector (ITU-T) H.323 standard, the ITU-T H.324 standard, or other standards. For one-way or two-way communication, encoding device 12 may generate encoded multimedia data according to a video compression standard, such as Moving Picture Experts Group (MPEG)-2, MPEG-4, ITU-T H.263, or ITU-T H.264, which corresponds to MPEG-4, Part 10, Advanced Video Coding (AVC). Although not shown in FIG. 1, encoding device 12 and decoding device 14 may be integrated with an audio encoder and decoder, respectively, and may include appropriate multiplexer-demultiplexer (MUX-DEMUX) modules, or other hardware, firmware, or software, to handle encoding of both audio and video in a common data sequence or separate data sequences. If applicable, the MUX-DEMUX modules may conform to the ITU H.223 multiplexer protocol, or other protocols such as the user datagram protocol (UDP).

In certain aspects, this disclosure contemplates application to Enhanced H.264 video coding for delivering real-time multimedia services in terrestrial mobile multimedia multicast (TM3) systems using the Forward Link Only (FLO) Air Interface Specification, “Forward Link Only Air Interface Specification for Terrestrial Mobile Multimedia Multicast,” published as Technical Standard TIA-1099, Aug. 2006 (“FLO Specification”). However, the coding cost estimation techniques described in this disclosure are not limited to any particular type of broadcast, multicast, unicast, or point-to-point system.

As illustrated in FIG. 1, encoding device 12 includes an encoding module 18 and a transmitter 20. Encoding module 18 receives one or more input multimedia sequences, which in the case of video encoding may include one or more frames of data, and selectively encodes the frames of the received multimedia sequences. Encoding module 18 receives the input multimedia sequences from one or more sources (not shown in FIG. 1). In some aspects, encoding module 18 may receive the input multimedia sequences from one or more video content providers, e.g., via satellite. As another example, encoding module 18 may receive the multimedia sequences from an image capture device (not shown in FIG. 1) integrated within encoding device 12 or coupled to encoding device 12. Alternatively, encoding module 18 may receive the multimedia sequences from a memory or archive (not shown in FIG. 1) within encoding device 12 or coupled to encoding device 12. The multimedia sequences may comprise live real-time or near real-time video, audio, or video and audio sequences to be coded and transmitted as a broadcast or on demand, or may comprise pre-recorded and stored video, audio, or video and audio to be coded and transmitted as a broadcast or on demand. In some aspects, at least a portion of the multimedia sequences may be computer-generated, such as in the case of gaming.

In any case, encoding module 18 encodes a plurality of frames and transmits the coded frames to decoding device 14 via transmitter 20. Encoding module 18 may encode the frames of the input multimedia sequences as intra-coded frames, inter-coded frames, or a combination thereof. Frames encoded using intra-coding techniques are coded without reference to other frames and are often referred to as intra (“I”) frames. Frames encoded using inter-coding techniques are coded with reference to one or more other frames. The inter-coded frames may include one or more predictive (“P”) frames, bi-directional (“B”) frames, or a combination thereof. P frames are encoded with reference to at least one temporally prior frame, while B frames are encoded with reference to at least one temporally future frame. In some cases, B frames may be encoded with reference to at least one temporally future frame and at least one temporally prior frame.

Encoding module 18 may be further configured to partition a frame into a plurality of blocks and encode each of the blocks separately. As an example, encoding module 18 may partition the frame into a plurality of 16×16 blocks. Some blocks, often referred to as “macroblocks,” comprise a grouping of sub-partition blocks (often referred to herein as “sub-blocks”). As an example, a 16×16 macroblock may comprise four 8×8 sub-blocks, or other sub-partition blocks. For example, the H.264 standard permits encoding of blocks with a variety of different sizes, e.g., 16×16, 16×8, 8×16, 8×8, 4×4, 8×4, and 4×8. Further, by extension, sub-blocks of any size may be included within a macroblock, e.g., 2×16, 16×2, 2×2, 4×16, and 8×2. Thus, encoding module 18 may be configured to divide the frame into several blocks and encode each of the blocks of pixels as an intra-coded block or an inter-coded block, each of which may be referred to generally as a block.
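The partitioning described above can be illustrated with a small sketch. It assumes a single plane of samples stored as a list of rows whose dimensions are multiples of the block size; the helper name `split_into_blocks` is hypothetical and is not taken from this disclosure.

```python
from typing import List

Block = List[List[int]]


def split_into_blocks(plane: Block, size: int) -> List[Block]:
    """Split a 2-D plane of samples into size x size blocks in raster order.

    Assumption: the plane height and width are multiples of `size`
    (a real encoder would pad the frame instead of raising).
    """
    height, width = len(plane), len(plane[0])
    if height % size or width % size:
        raise ValueError("plane dimensions must be multiples of the block size")
    blocks = []
    for top in range(0, height, size):
        for left in range(0, width, size):
            # Copy `size` rows, keeping only the columns of this block.
            blocks.append([row[left:left + size] for row in plane[top:top + size]])
    return blocks


# A 32x32 plane yields four 16x16 macroblocks; each macroblock in turn
# yields four 8x8 sub-blocks when the same helper is applied again.
plane = [[y * 32 + x for x in range(32)] for y in range(32)]
macroblocks = split_into_blocks(plane, 16)
sub_blocks = split_into_blocks(macroblocks[0], 8)
```

The same helper applies at both levels because a macroblock is itself just a smaller plane of samples.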

Encoding module 18 may support a plurality of coding modes. Each of the modes may correspond to a different combination of block size and coding technique. In the case of the H.264 standard, for example, there are seven inter modes and thirteen intra modes. The seven variable block-size inter modes include a SKIP mode, a 16×16 mode, a 16×8 mode, an 8×16 mode, an 8×8 mode, an 8×4 mode, a 4×8 mode, and a 4×4 mode. The thirteen intra modes include an INTRA 4×4 mode, for which there are nine possible interpolation directions, and an INTRA 16×16 mode, for which there are four possible interpolation directions.

To provide high compression efficiency, in accordance with various aspects of this disclosure, encoding module 18 attempts to select the mode that codes the data of the block with high efficiency. To this end, encoding module 18 estimates, for each of the blocks, a coding cost for at least a portion of the modes. Encoding module 18 estimates the coding cost as a function of rate and distortion. In accordance with the techniques described herein, encoding module 18 estimates the coding costs for the modes without actually coding the blocks to determine the rate and distortion metrics. In this manner, encoding module 18 may select one of the modes based on at least the coding cost without performing computationally complex coding of the data of the block for each mode. Conventional mode selection requires actually coding the data using each of the modes to determine which mode to select. The techniques therefore save time and computational resources by selecting a mode based on the coding cost without actually coding the data for each of the modes. In fact, in some aspects, encoding module 18 may estimate the coding costs for the modes without quantizing the data of the block for each mode. In this manner, the coding cost estimation techniques of this disclosure reduce the amount of computationally intensive calculation needed to perform effective mode selection.
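As a rough illustration of selecting a mode from estimated costs, the sketch below assumes the common Lagrangian form, cost = distortion + λ · rate. The text above only states that the cost is a function of rate and distortion, so this particular form, the function name `select_mode`, the mode labels, and the numeric estimates are all assumptions made for the example.

```python
from typing import Callable, Iterable, Tuple


def select_mode(modes: Iterable[str],
                estimate: Callable[[str], Tuple[float, float]],
                lam: float) -> Tuple[str, float]:
    """Pick the mode with the lowest estimated cost.

    `estimate(mode)` returns an (estimated distortion, estimated bits) pair
    obtained WITHOUT actually coding the block. The Lagrangian combination
    below is an assumed cost function, not one given by the disclosure.
    """
    best_mode, best_cost = None, float("inf")
    for mode in modes:
        distortion, rate_bits = estimate(mode)
        cost = distortion + lam * rate_bits
        if cost < best_cost:
            best_mode, best_cost = mode, cost
    if best_mode is None:
        raise ValueError("no candidate modes supplied")
    return best_mode, best_cost


# Hypothetical per-mode estimates: (distortion, bits).
estimates = {
    "INTER_16x16": (120.0, 40.0),
    "INTER_8x8": (80.0, 95.0),
    "INTRA_4x4": (60.0, 150.0),
}
choice_high_lambda = select_mode(estimates, estimates.__getitem__, lam=1.0)
choice_low_lambda = select_mode(estimates, estimates.__getitem__, lam=0.1)
```

With these made-up numbers, a larger λ favors the cheaper 16×16 inter mode, while a smaller λ favors the lower-distortion intra mode, which shows how the weighting between rate and distortion steers the selection.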

Encoding device 12 applies the selected mode to code the blocks of the frames and transmits the coded frames of data via transmitter 20. Transmitter 20 may include appropriate modem and driver circuitry, software and/or firmware to transmit encoded multimedia over transmission channel 16. For wireless applications, transmitter 20 includes RF circuitry to transmit wireless data carrying the encoded multimedia data.

Decoding device 14 includes a receiver 22 and a decoding module 24. Decoding device 14 receives the encoded data from encoding device 12 via receiver 22. Like transmitter 20, receiver 22 may include appropriate modem and driver circuitry, software and/or firmware to receive encoded multimedia over transmission channel 16, and may include RF circuitry to receive wireless data carrying the encoded multimedia data in wireless applications. Decoding module 24 decodes the coded frames of data received via receiver 22. Decoding device 14 may further present the decoded frames of data to a user via a display (not shown), which may be integrated within decoding device 14 or provided as a discrete device coupled to decoding device 14 via a wired or wireless connection.

In some examples, encoding device 12 and decoding device 14 may each include reciprocal transmit and receive circuitry so that each may serve as both a transmit device and a receive device for encoded multimedia and other information transmitted over transmission channel 16. In this case, both encoding device 12 and decoding device 14 may transmit and receive multimedia sequences and thus participate in two-way communications. In other words, the illustrated components of coding system 10 may be integrated as part of an encoder/decoder (CODEC).

The components in encoding device 12 and decoding device 14 are examples of components that can be used to implement the techniques described herein. Encoding device 12 and decoding device 14, however, may include many other components, if desired. For example, encoding device 12 may include a plurality of encoding modules that each receive one or more sequences of multimedia data and encode the respective sequences in accordance with the techniques described herein. In this case, encoding device 12 may further include at least one multiplexer to combine the segments of data for transmission. In addition, encoding device 12 and decoding device 14 may include appropriate modulation, demodulation, frequency conversion, filtering, and amplifier components for transmission and reception of encoded video, including radio frequency (RF) wireless components and antennas, as applicable. For ease of illustration, however, such components are not shown in FIG. 1.

FIG. 2 is a block diagram illustrating an exemplary encoding module 30 in further detail. Encoding module 30 may, for example, represent encoding module 18 of encoding device 12 of FIG. 1. As illustrated in FIG. 2, encoding module 30 includes a control module 32 that receives input frames of multimedia data of one or more multimedia sequences from one or more sources and processes the frames of the received multimedia sequences. In particular, control module 32 analyzes the incoming frames of the multimedia sequences and determines, based on the analysis, whether to encode or skip the incoming frames. In some aspects, encoding device 12 may encode the information contained in the multimedia sequences at a reduced frame rate, using frame skipping to conserve bandwidth across transmission channel 16.

Moreover, for the incoming frames that will be encoded, control module 32 may also be configured to determine whether to encode the frames as I frames, P frames, or B frames. Control module 32 may determine to encode an incoming frame as an I frame at the start of a multimedia sequence, at a scene change within the sequence, for use as a channel switch frame, or for use as an intra refresh frame. Otherwise, control module 32 encodes the frame as an inter-coded frame (i.e., a P frame or B frame) to reduce the amount of bandwidth associated with coding the frame.

Control module 32 may be further configured to partition the frames into a plurality of blocks and select a coding mode, such as one of the H.264 coding modes described above, for each of the blocks. As will be described in detail below, encoding module 30 may estimate the coding costs for at least a portion of the modes to assist in selecting the most efficient of the coding modes. After selecting the coding mode for use in coding one of the blocks, encoding module 30 generates residual data for the block. For a block selected to be intra-coded, spatial prediction module 34 generates the residual data for the block. Spatial prediction module 34 may, for example, generate a predicted version of the block via interpolation using one or more adjacent blocks and an interpolation directionality corresponding to the selected intra-coding mode. Spatial prediction module 34 may then compute a difference between the block of the input frame and the predicted block. This difference is referred to as residual data or residual coefficients.
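The intra-coding path described above (predict the block from neighboring samples, then subtract) can be sketched for one simple case. The example assumes a DC-style prediction, in which every predicted sample is the rounded mean of the reconstructed samples above and to the left of the block; directional interpolation is omitted, and the names `dc_predict` and `block_residual` are hypothetical.

```python
from typing import List, Sequence

Block = List[List[int]]


def dc_predict(above: Sequence[int], left: Sequence[int], size: int) -> Block:
    """Build a size x size predicted block filled with the rounded mean
    of the neighboring samples (a DC-style prediction, assumed here as
    the simplest stand-in for directional interpolation)."""
    neighbours = list(above) + list(left)
    dc = (sum(neighbours) + len(neighbours) // 2) // len(neighbours)
    return [[dc] * size for _ in range(size)]


def block_residual(block: Block, predicted: Block) -> Block:
    """Residual data: sample-wise difference between input and prediction."""
    return [[b - p for b, p in zip(b_row, p_row)]
            for b_row, p_row in zip(block, predicted)]


above = [100, 102, 104, 106]   # reconstructed row above the block
left = [98, 100, 102, 104]     # reconstructed column left of the block
block = [[102, 103, 104, 105] for _ in range(4)]
predicted = dc_predict(above, left, 4)
residual = block_residual(block, predicted)
```

Because the prediction captures the average level of the block, the residual contains only small values, which is what makes it cheaper to code than the original samples.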

For a block selected to be inter-coded, motion estimation module 36 and motion compensation module 38 generate the residual data for the block. In particular, motion estimation module 36 identifies at least one reference frame and searches for the block in the reference frame that best matches the block in the input frame. Motion estimation module 36 computes a motion vector to represent the offset between the location of the block in the input frame and the location of the identified block in the reference frame. Motion compensation module 38 computes a difference between the block of the input frame and the identified block in the reference frame to which the motion vector points. This difference is referred to as the residual data for the block.
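A minimal sketch of the inter-coding path follows. It assumes an exhaustive search over a small window and the sum of absolute differences (SAD) as the matching criterion; neither choice is specified in the text above, and the names `sad`, `motion_search` and `inter_residual` are hypothetical.

```python
from typing import List, Tuple

Frame = List[List[int]]


def sad(cur: Frame, ref: Frame, top: int, left: int,
        ref_top: int, ref_left: int, size: int) -> int:
    """Sum of absolute differences between a block of `cur` and a block of `ref`."""
    return sum(abs(cur[top + y][left + x] - ref[ref_top + y][ref_left + x])
               for y in range(size) for x in range(size))


def motion_search(cur: Frame, ref: Frame, top: int, left: int,
                  size: int, search_range: int) -> Tuple[Tuple[int, int], int]:
    """Exhaustive block matching; returns ((dy, dx), best SAD)."""
    best_mv, best_cost = (0, 0), sad(cur, ref, top, left, top, left, size)
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            ref_top, ref_left = top + dy, left + dx
            # Skip candidates that fall outside the reference frame.
            if not (0 <= ref_top <= len(ref) - size
                    and 0 <= ref_left <= len(ref[0]) - size):
                continue
            cost = sad(cur, ref, top, left, ref_top, ref_left, size)
            if cost < best_cost:
                best_mv, best_cost = (dy, dx), cost
    return best_mv, best_cost


def inter_residual(cur: Frame, ref: Frame, top: int, left: int,
                   mv: Tuple[int, int], size: int) -> Frame:
    """Residual between the input block and the block the motion vector points to."""
    dy, dx = mv
    return [[cur[top + y][left + x] - ref[top + dy + y][left + dx + x]
             for x in range(size)] for y in range(size)]


# Synthetic frames: the current frame is the reference shifted by (1, 2).
ref_frame = [[7 * y * y + 13 * x for x in range(12)] for y in range(12)]
cur_frame = [[ref_frame[min(y + 1, 11)][min(x + 2, 11)] for x in range(12)]
             for y in range(12)]
mv, cost = motion_search(cur_frame, ref_frame, top=4, left=4, size=4, search_range=2)
res = inter_residual(cur_frame, ref_frame, 4, 4, mv, 4)
```

In this synthetic case the search recovers the shift exactly, so the residual is all zeros; with real content the best match is only approximate and the residual carries the remaining difference.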

Encoding module 30 also includes transform module 40, quantization module 46, and entropy encoder 48. Transform module 40 transforms the residual data of the block according to a transform function. In some aspects, transform module 40 applies an integer transform, e.g., a 4×4 or 8×8 integer transform, or a discrete cosine transform (DCT) to the residual data to generate transform coefficients for the residual data. Quantization module 46 quantizes the transform coefficients and provides the quantized transform coefficients to entropy encoder 48. Entropy encoder 48 encodes the quantized transform coefficients using a context-adaptive coding technique, e.g., context-adaptive variable length coding (CAVLC) or context-adaptive binary arithmetic coding (CABAC). As described in detail below, entropy encoder 48 applies the selected mode to code the data of the block.
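As a non-limiting illustration of a 4×4 integer transform, the sketch below applies the forward core transform matrix commonly associated with H.264 to a 4×4 residual block. The matrix `CF` and the helper names are assumptions made for this example; this disclosure does not specify the transform matrix, and the scaling that H.264 folds into quantization is omitted.

```python
# Forward 4x4 core transform matrix commonly associated with H.264.
# Assumed for illustration only; not specified in this disclosure.
CF = [
    [1, 1, 1, 1],
    [2, 1, -1, -2],
    [1, -1, -1, 1],
    [1, -2, 2, -1],
]


def matmul(a, b):
    """Multiply two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def forward_transform_4x4(residual):
    """Compute CF * residual * CF^T for a 4x4 residual block."""
    return matmul(matmul(CF, residual), transpose(CF))


# A flat residual block concentrates all energy in the DC coefficient.
flat_block = [[1] * 4 for _ in range(4)]
coefficients = forward_transform_4x4(flat_block)  # coefficients[0][0] == 16
```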

Entropy encoder 48 may also encode additional data associated with the block. For example, in addition to the residual data, entropy encoder 48 may encode one or more motion vectors of the block, an identifier indicating the coding mode of the block, one or more reference frame indices, quantization parameter (QP) information, slice information of the block, and the like. Entropy encoder 48 may receive this additional block data from other modules within encoding module 30. For example, motion vector information may be received from motion estimation module 36, and block mode information may be received from control module 32. In some aspects, entropy encoder 48 may code at least a portion of this additional information using a fixed length coding (FLC) technique or a universal variable length coding (VLC) technique, e.g., exponential-Golomb coding ("Exp-Golomb"). Alternatively, entropy encoder 48 may encode a portion of the additional block data using the context-adaptive coding techniques described above, i.e., CABAC or CAVLC.
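Because an Exp-Golomb codeword length depends only on the value being coded, the bit count for such data can be obtained without running the entropy coder. The following is a minimal sketch, assuming the usual order-0 unsigned Exp-Golomb code and the signed mapping used in H.264 (positive v maps to 2v - 1, non-positive v maps to -2v). The function names are hypothetical.

```python
def ue_bits(value):
    """Length in bits of the order-0 unsigned Exp-Golomb codeword."""
    if value < 0:
        raise ValueError("unsigned Exp-Golomb requires a non-negative value")
    # The codeword has (n - 1) leading zeros followed by the n-bit
    # binary form of value + 1, so its length is 2n - 1.
    return 2 * (value + 1).bit_length() - 1


def se_bits(value):
    """Length in bits of the signed Exp-Golomb codeword."""
    mapped = 2 * value - 1 if value > 0 else -2 * value
    return ue_bits(mapped)


def motion_vector_bits(dx, dy):
    """Bits for one motion vector difference coded as two signed values."""
    return se_bits(dx) + se_bits(dy)
```

For example, `ue_bits(0)` is 1 and `ue_bits(3)` is 5, matching the codewords `1` and `00100`.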

To assist in selecting a mode for a block, control module 32 estimates the coding cost for at least some of the possible modes. In certain aspects, control module 32 may estimate the cost of coding the block in each of the possible coding modes. The cost may be estimated, for example, in terms of the number of bits associated with coding the block in a given mode versus the amount of distortion produced in that mode. In the case of the H.264 standard, for example, control module 32 may estimate coding costs for 22 different coding modes (inter-frame and intra-frame coding modes) for a block selected for inter-frame coding and for 13 different coding modes for a block selected for intra-frame coding. In other aspects, control module 32 may first reduce the set of possible modes using another mode selection technique and then use the techniques of this disclosure to estimate the coding cost for the remaining modes of the set. In other words, in some aspects, control module 32 may narrow down the number of mode possibilities before applying the cost estimation techniques. Advantageously, encoding module 30 estimates the coding cost for the modes without actually coding the data of the block in the different modes, thereby reducing the computational overhead associated with the coding decision. In fact, in the example shown in FIG. 2, encoding module 30 may estimate the coding cost without quantizing the data of the block for the different modes. In this manner, the coding cost estimation techniques of this disclosure reduce the amount of computationally intensive calculation needed to determine the coding cost. In particular, it is not necessary to encode the block using the various coding modes in order to select one of the modes.

As described in further detail herein, control module 32 estimates the coding cost of each analyzed mode according to the following equation:

$$J = D + \lambda_{mode} \cdot R \qquad (1)$$

where J is the estimated coding cost, D is a distortion metric of the block, λ_mode is a Lagrange multiplier of the respective mode, and R is a rate metric of the block. The distortion metric (D) may comprise, for example, a sum of absolute differences (SAD), a sum of squared differences (SSD), a sum of absolute transformed differences (SATD), a sum of squared transformed differences (SSTD), or the like. The rate metric (R) may be, for example, the number of bits associated with coding the data of a given block. As described above, different types of block data may be coded using different coding techniques. Equation (1) can therefore be rewritten in the following form:

$$J = D + \lambda_{mode} \cdot \left( R_{context} + R_{non\_context} \right) \qquad (2)$$

where R_context represents the rate metric for block data coded using context-adaptive coding techniques and R_non_context represents the rate metric for block data coded using non-context-adaptive coding techniques. In the H.264 standard, for example, residual data may be coded using context-adaptive coding, such as CAVLC or CABAC. Other block data, e.g., motion vectors, block modes, and the like, may be coded using FLC or a universal VLC technique, e.g., Exp-Golomb. In this case, equation (2) can be rewritten in the following form:

$$J = D + \lambda_{mode} \cdot \left( R_{residual} + R_{other} \right) \qquad (3)$$

where R_residual represents the rate metric for coding the residual data using context-adaptive coding techniques, e.g., the number of bits associated with coding the residual data, and R_other represents the rate metric for coding the other block data using FLC or universal VLC techniques, e.g., the number of bits associated with coding the other block data.
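As a non-limiting illustration, the cost defined by the terms above can be evaluated as a Lagrangian sum, J = D + λ_mode · (R_residual + R_other). This form is assumed from the definitions of J, D, λ_mode, R_residual, and R_other given in the text; the function and parameter names are hypothetical.

```python
def coding_cost(distortion, lagrange_multiplier, r_residual, r_other):
    """Estimated cost J = D + lambda_mode * (R_residual + R_other).

    distortion:          distortion metric D (e.g., SAD or SATD)
    lagrange_multiplier: lambda_mode for the mode being evaluated
    r_residual:          bits (or estimated bits) for the residual data
    r_other:             bits for motion vectors, mode identifiers, etc.
    """
    return distortion + lagrange_multiplier * (r_residual + r_other)
```

A larger multiplier weights rate more heavily than distortion, so the same pair of modes can rank differently at different values of λ_mode.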

In computing the estimated coding cost (J), encoding module 30 can determine the number of bits associated with coding block data using FLC or universal VLC, i.e., R_other, relatively easily. Encoding module 30 may, for example, use a code table to identify the number of bits associated with coding the block data using FLC or universal VLC. The code table may include, for example, a plurality of codewords and the number of bits associated with coding each codeword. Determining the number of bits associated with coding the residual data (R_residual), however, is a much more difficult task, because context-adaptive coding adapts as a function of the context of the data. To determine the exact number of bits associated with coding the residual data, or whatever data is being context-adaptively coded, encoding module 30 must transform the residual data, quantize the transformed residual data, and encode the transformed and quantized residual data. In accordance with the techniques of this disclosure, however, bit estimation module 42 may estimate the number of bits associated with coding the residual data using context-adaptive coding techniques without actually coding the residual data.

In the example shown in FIG. 2, bit estimation module 42 estimates the number of bits associated with coding the residual data using the transform coefficients for the residual data. Thus, for each mode to be analyzed, encoding module 30 need only compute the transform coefficients for the residual data in order to estimate the number of bits associated with coding the residual data. By neither quantizing the transform coefficients nor encoding quantized transform coefficients for each mode, encoding module 30 reduces the computing resources and the time needed to determine the number of bits associated with coding the residual data.

Bit estimation module 42 analyzes the transform coefficients output by transform module 40 and identifies one or more transform coefficients that will remain non-zero after quantization. In particular, bit estimation module 42 compares each of the transform coefficients with a corresponding threshold. In some aspects, the corresponding thresholds may be computed as a function of the QP of encoding module 30. Bit estimation module 42 identifies the transform coefficients that are greater than or equal to their corresponding thresholds as the transform coefficients that will remain non-zero after quantization.

Bit estimation module 42 estimates the number of bits associated with coding the residual data based at least on the transform coefficients identified as remaining non-zero after quantization. In particular, bit estimation module 42 determines the number of non-zero transform coefficients that survive quantization. Bit estimation module 42 also sums at least a portion of the absolute values of the transform coefficients identified as surviving quantization. Bit estimation module 42 then estimates the rate metric for the residual data, i.e., the number of bits associated with coding the residual data, using the following equation:

$$R_{residual} = a_1 \cdot SATD + a_2 \cdot NZ_{est} + a_3 \qquad (4)$$

where SATD is the sum of at least a portion of the absolute values of the non-zero transform coefficients predicted to survive quantization, NZ_est is the estimated number of non-zero transform coefficients predicted to survive quantization, and a_1, a_2, and a_3 are coefficients. The coefficients a_1, a_2, and a_3 may be computed, for example, using least squares estimation. Although the sum of the transform coefficients in example equation (4) is the sum of absolute transformed differences SATD, other difference metrics, e.g., SSTD, may be used.
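As a non-limiting illustration of the least squares estimation mentioned above, the sketch below fits a_1, a_2, and a_3 to training samples of the form (SATD, NZ_est, measured bits). It assumes the linear model R_residual = a_1 · SATD + a_2 · NZ_est + a_3 suggested by the definitions above and solves the normal equations directly. The function name and the way training samples would be gathered (for example, by running the actual entropy coder offline) are assumptions not stated in this disclosure.

```python
def fit_rate_model(samples):
    """Least squares fit of bits ~ a1 * satd + a2 * nz_est + a3.

    samples: iterable of (satd, nz_est, bits) tuples.
    Returns (a1, a2, a3).
    """
    # Accumulate the normal equations (X^T X) a = X^T y,
    # where each row of X is (satd, nz_est, 1).
    xtx = [[0.0] * 3 for _ in range(3)]
    xty = [0.0] * 3
    for satd, nz_est, bits in samples:
        x = (float(satd), float(nz_est), 1.0)
        for r in range(3):
            xty[r] += x[r] * bits
            for c in range(3):
                xtx[r][c] += x[r] * x[c]

    # Solve the 3x3 system by Gauss-Jordan elimination with pivoting.
    aug = [xtx[r] + [xty[r]] for r in range(3)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("samples do not determine the coefficients")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(3):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return tuple(aug[r][3] / aug[r][r] for r in range(3))
```

With synthetic samples generated from known coefficients, the fit recovers those coefficients; real training data would be noisy, and the fitted values would depend on the content and QP range used.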

An example computation of R_residual for a 4×4 block follows. Similar computations may be performed for blocks of other sizes. Encoding module 30 computes a matrix of transform coefficients for the residual data. An exemplary matrix of transform coefficients is shown below.

The number of rows of transform coefficient matrix (A) is equal to the number of rows of pixels in the block, and the number of columns of the transform coefficient matrix is equal to the number of columns of pixels in the block. In the example above, therefore, the dimensions of the transform coefficient matrix are 4×4, corresponding to the 4×4 block. Each entry A(i, j) of the transform coefficient matrix is the transform of the respective residual coefficient.

During quantization, the transform coefficients of matrix A that have smaller values tend to become zero. Accordingly, encoding module 30 compares the matrix of residual transform coefficients A with a threshold matrix to predict which transform coefficients of matrix A will remain non-zero after quantization. An exemplary threshold matrix is shown below.

Matrix C may be computed as a function of the QP value. The dimensions of matrix C are the same as the dimensions of matrix A. In the case of the H.264 standard, for example, the entries of matrix C may be computed based on the following equation:

$$C(i,j) = \frac{2^{QBITS\{QP\}} - Level\_Offset(i,j)\{QP\}}{Level\_Scale(i,j)\{QP\}} \qquad (5)$$

where QBITS{QP} is a parameter that determines scaling as a function of QP; Level_Offset(i, j){QP} is the dead zone parameter for the entry at row i and column j of the matrix and is also a function of QP; Level_Scale(i, j){QP} is the multiplicative factor for the entry at row i and column j of the matrix and is also a function of QP; i corresponds to a row of the matrix; j corresponds to a column of the matrix; and QP corresponds to the quantization parameter of encoding module 30. In example equation (5), the variables can be defined in the H.264 coding standard as functions of the operating QP.
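As a non-limiting illustration, a threshold of this kind follows from a dead-zone scalar quantizer of the assumed form level = (|w| · Level_Scale + Level_Offset) >> QBITS: a coefficient yields a non-zero level exactly when |w| ≥ (2^QBITS − Level_Offset) / Level_Scale. Both the quantizer form and the resulting threshold expression are assumptions made for this sketch, and the numeric parameters used in the example are hypothetical rather than values from any standard.

```python
def quantize_magnitude(w, qbits, offset, scale):
    """Assumed dead-zone quantizer: (|w| * scale + offset) >> qbits."""
    return (abs(w) * scale + offset) >> qbits


def threshold_matrix(qbits, level_offset, level_scale):
    """Per-position threshold C(i, j) under the assumed quantizer.

    A coefficient magnitude at or above C(i, j) quantizes to a
    non-zero level; a smaller magnitude quantizes to zero.
    """
    return [
        [((1 << qbits) - off) / scale for off, scale in zip(off_row, scale_row)]
        for off_row, scale_row in zip(level_offset, level_scale)
    ]


# Hypothetical parameters: qbits = 4, offset = 4, scale = 3 everywhere.
offsets = [[4] * 4 for _ in range(4)]
scales = [[3] * 4 for _ in range(4)]
thresholds = threshold_matrix(4, offsets, scales)  # every entry is 4.0
```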

Other equations may be used to determine which of the variables survive quantization, and in other coding standards they may be defined based on the quantization method adopted by the particular standard. In some aspects, encoding module 30 may be configured to operate within a range of QP values. In this case, encoding module 30 may pre-compute a plurality of comparison matrices, one for each QP value within the range. Encoding module 30 then selects the comparison matrix that corresponds to its QP for comparison with the transform coefficient matrix.
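As a non-limiting illustration of pre-computing comparison matrices, the sketch below builds one threshold matrix per QP value and stores the matrices in a table keyed by QP, so that mode analysis needs only a lookup. The threshold expression matches the assumed form (2^QBITS − Level_Offset) / Level_Scale, and `toy_params` returns hypothetical parameters that are not taken from any standard; a real encoder would substitute the tables of its own quantizer.

```python
def build_threshold_table(qp_values, params_for_qp):
    """Map each QP to its pre-computed threshold matrix.

    params_for_qp(qp) must return (qbits, level_offset, level_scale),
    where the last two are matrices with the block's dimensions.
    """
    table = {}
    for qp in qp_values:
        qbits, offsets, scales = params_for_qp(qp)
        table[qp] = [
            [((1 << qbits) - off) / scale
             for off, scale in zip(off_row, scale_row)]
            for off_row, scale_row in zip(offsets, scales)
        ]
    return table


def toy_params(qp):
    """Hypothetical quantizer parameters for a 4x4 block."""
    qbits = 4 + qp // 6
    offset = (1 << qbits) // 3
    return qbits, [[offset] * 4 for _ in range(4)], [[8] * 4 for _ in range(4)]


# Pre-compute once for the operating range, then look up by QP.
threshold_table = build_threshold_table(range(20, 31), toy_params)
thresholds_for_qp_24 = threshold_table[24]
```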

The result of comparing transform coefficient matrix A with threshold matrix C is a matrix of ones and zeros. In the example above, the comparison yields the matrix of ones and zeros shown below.

where a one represents the position of a transform coefficient identified as likely to survive quantization, i.e., likely to remain non-zero, and a zero represents the position of a transform coefficient that is likely not to survive quantization, i.e., likely to become zero. As described above, a transform coefficient is identified as likely to remain non-zero when the absolute value of the transform coefficient of matrix A is greater than or equal to the corresponding threshold of matrix C.
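As a non-limiting illustration, the comparison described above can be sketched as an element-wise test of |A(i, j)| against C(i, j). The matrices in the example are hypothetical values chosen for this sketch; they are not the exemplary matrices of this disclosure.

```python
def survival_mask(coefficients, thresholds):
    """Return a matrix of ones and zeros.

    An entry is 1 where |A(i, j)| >= C(i, j), i.e., where the
    coefficient is predicted to remain non-zero after quantization,
    and 0 elsewhere.
    """
    return [
        [1 if abs(a) >= c else 0 for a, c in zip(a_row, c_row)]
        for a_row, c_row in zip(coefficients, thresholds)
    ]


# Hypothetical transform coefficients and a uniform threshold of 10.
example_a = [
    [120, -40, 6, 0],
    [35, -9, 3, 0],
    [-8, 4, 0, 0],
    [2, 0, 0, 0],
]
example_c = [[10] * 4 for _ in range(4)]
example_mask = survival_mask(example_a, example_c)
# example_mask == [[1, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
```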

Using the resulting matrix of ones and zeros, bit estimation module 42 determines the number of transform coefficients that survive quantization. In other words, bit estimation module 42 determines the number of transform coefficients identified as remaining non-zero after quantization. Bit estimation module 42 determines this number according to the following equation:

$$NZ_{est} = \sum_{i} \sum_{j} M(i,j) \qquad (6)$$

where NZ_est is the estimated number of non-zero transform coefficients and M(i, j) is the value of matrix M at row i and column j. In the example above, NZ_est is equal to 8.

Bit estimation module 42 also computes the sum of at least a portion of the absolute values of the transform coefficients estimated to survive quantization. In certain aspects, bit estimation module 42 may compute this sum according to the following equation:

$$SATD = \sum_{i} \sum_{j} M(i,j) \cdot abs\left(A(i,j)\right) \qquad (7)$$

where SATD is the sum of the absolute values of the transform coefficients identified as remaining non-zero after quantization, M(i, j) is the value of matrix M at row i and column j, A(i, j) is the value of matrix A at row i and column j, and abs(x) is the absolute value function, which computes the absolute value of x. In the example described above, SATD is equal to 2361. Other difference metrics, e.g., SSTD, may also be used for the transform coefficients.
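As a non-limiting illustration, the two quantities described above can be computed from matrix A and the matrix M of ones and zeros: NZ_est as the count of ones in M, and SATD as the sum of |A(i, j)| over the positions where M(i, j) is one. The function names are hypothetical, and the sample matrices in the usage note are illustrative values, not the exemplary matrices of this disclosure.

```python
def nz_estimate(mask):
    """NZ_est: number of coefficients predicted to remain non-zero."""
    return sum(sum(row) for row in mask)


def masked_satd(coefficients, mask):
    """SATD restricted to coefficients predicted to remain non-zero."""
    return sum(
        m * abs(a)
        for a_row, m_row in zip(coefficients, mask)
        for a, m in zip(a_row, m_row)
    )
```

For a hypothetical A with first row (120, -40, 6, 0) and second row (35, -9, 3, 0), all other entries below threshold, and a mask selecting 120, -40, and 35, these functions give NZ_est = 3 and SATD = 195.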

Using these values, bit estimation module 42 approximates the number of bits associated with coding the residual coefficients using equation (4) above. Control module 32 may use the estimate of R_residual to compute an estimate of the total coding cost of the mode. Encoding module 30 may estimate the total coding cost for one or more other possible modes in the same manner and select the mode with the lowest coding cost. Encoding module 30 then applies the selected coding mode to code the block or blocks of the frame.
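As a non-limiting illustration, the bit estimate can be evaluated from SATD and NZ_est once the coefficients a_1, a_2, and a_3 are known. The sketch assumes the linear form R_residual = a_1 · SATD + a_2 · NZ_est + a_3; the values SATD = 2361 and NZ_est = 8 come from the example above, whereas the coefficient values are hypothetical placeholders, since this disclosure does not give them.

```python
def estimate_residual_bits(satd, nz_est, a1, a2, a3):
    """Estimated bits for the residual: a1 * SATD + a2 * NZ_est + a3."""
    return a1 * satd + a2 * nz_est + a3


# SATD and NZ_est from the worked example; a1, a2, a3 are hypothetical.
estimated_bits = estimate_residual_bits(2361, 8, a1=0.01, a2=2.0, a3=1.0)
```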

The foregoing techniques may be implemented individually within encoding device 12, or two or more of the techniques, or all of them, may be implemented together. The components in encoding module 30 are exemplary of those applicable to implement the techniques described herein. Encoding module 30, however, may include many other components, if desired, as well as fewer components that combine the functionality of one or more of the modules described above. The components in encoding module 30 may be implemented as one or more processors, digital signal processors, application specific integrated circuits (ASICs), field programmable gate arrays (FPGAs), discrete logic, software, hardware, firmware, or any combination thereof. Depicting different features as modules is intended to highlight different functional aspects of encoding module 30 and does not necessarily imply that the modules must be realized by separate hardware or software components. Rather, functionality associated with one or more modules may be integrated within common or separate hardware or software components.

FIG. 3 is a block diagram illustrating another exemplary encoding module 50. Encoding module 50 of FIG. 3 conforms substantially to encoding module 30 of FIG. 2, except that bit estimation module 52 of encoding module 50 estimates the number of bits associated with coding the residual data after the transform coefficients for the residual data have been quantized. In particular, after quantization of the transform coefficients, bit estimation module 52 estimates the number of bits associated with coding the residual coefficients using the following equation:

$$R_{residual} = a_1 \cdot SATQD + a_2 \cdot NZ_{TQ} + a_3 \qquad (8)$$

where SATQD is the sum of the absolute values of the non-zero quantized transform coefficients, NZ_TQ is the number of non-zero quantized transform coefficients, and a_1, a_2, and a_3 are coefficients. The coefficients a_1, a_2, and a_3 may be computed, for example, using least squares estimation. Although encoding module 50 quantizes the transform coefficients before estimating the number of bits associated with coding the residual data, encoding module 50 still estimates the coding cost for the modes without actually coding the data of the block. The amount of computationally intensive calculation is therefore still reduced.
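As a non-limiting illustration of the variant of FIG. 3, the sketch below derives SATQD and NZ_TQ directly from a matrix of quantized levels and evaluates the assumed linear model R_residual = a_1 · SATQD + a_2 · NZ_TQ + a_3. The function names, the sample levels, and the coefficient values are hypothetical.

```python
def quantized_stats(levels):
    """Return (SATQD, NZ_TQ) for a matrix of quantized coefficients."""
    flat = [v for row in levels for v in row]
    satqd = sum(abs(v) for v in flat)           # zeros contribute nothing
    nz_tq = sum(1 for v in flat if v != 0)      # count of non-zero levels
    return satqd, nz_tq


def estimate_bits_from_levels(levels, a1, a2, a3):
    """Estimated residual bits: a1 * SATQD + a2 * NZ_TQ + a3."""
    satqd, nz_tq = quantized_stats(levels)
    return a1 * satqd + a2 * nz_tq + a3


# Hypothetical quantized levels for a 4x4 block.
example_levels = [
    [5, -2, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
example_stats = quantized_stats(example_levels)  # (8, 3)
```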

FIG. 4 is a flow diagram illustrating exemplary operation of an encoding module, e.g., encoding module 30 of FIG. 2 and/or encoding module 50 of FIG. 3, that selects an encoding mode based at least on estimated coding costs. For purposes of example, however, FIG. 4 is described with respect to encoding module 30. Encoding module 30 selects a mode for which to estimate the coding cost (60). Encoding module 30 generates a distortion metric for the current block (62). Encoding module 30 may, for example, compute the distortion metric based on a comparison between the block and at least one reference block. For a block selected for intra-frame coding, the reference block may be an adjacent block within the same frame. For a block selected for inter-frame coding, on the other hand, the reference block may be a block from an adjacent frame. The distortion metric may be, for example, SAD, SSD, SATD, SSTD, or another similar distortion metric.

In the example of FIG. 4, encoding module 30 determines the number of bits associated with coding the portion of the data that is coded using non-context-adaptive coding techniques (64). As described above, this data may include one or more motion vectors of the block, an identifier indicating the coding mode of the block, one or more reference frame indices, QP information, slice information of the block, and the like. Encoding module 30 may, for example, use a code table to identify the number of bits associated with coding the data using FLC, universal VLC, or another non-context-adaptive coding technique.

Encoding module 30 estimates and/or computes the number of bits associated with coding the portion of the data that is coded using context-adaptive coding techniques (66). With respect to the H.264 standard, for example, encoding module 30 may estimate the number of bits associated with coding the residual data using context-adaptive coding. Encoding module 30 may estimate this number of bits without actually coding the residual data. In certain aspects, encoding module 30 may estimate the number of bits associated with coding the residual data without quantizing the residual data. For example, encoding module 30 may compute transform coefficients for the residual data and identify the transform coefficients likely to remain non-zero after quantization. Using these identified transform coefficients, encoding module 30 estimates the number of bits associated with coding the residual data. In other aspects, encoding module 30 may quantize the transform coefficients and estimate the number of bits associated with coding the residual data based at least on the quantized transform coefficients. In either case, encoding module 30 saves time and processing resources by estimating the number of bits required. If sufficient computing resources are available, encoding module 30 may compute the actual number of bits required instead of estimating it.

Encoding module 30 estimates and/or computes the total coding cost for coding the block in the selected mode (68). Encoding module 30 may estimate the total coding cost for coding the block based on the distortion metric, the bits associated with coding the portion of the data coded using non-context-adaptive coding, and the bits associated with coding the portion of the data coded using context-adaptive coding. For example, encoding module 30 may estimate the total coding cost for coding the block in the selected mode using equation (2) or (3) above.

Encoding module 30 determines whether there are any other coding modes for which to estimate the coding cost (70). As described above, encoding module 30 estimates the coding cost for at least some of the possible modes. In certain aspects, encoding module 30 may estimate the cost of coding the block in each of the possible coding modes. In the H.264 standard, for example, encoding module 30 may estimate coding costs for 22 different coding modes (inter-frame and intra-frame coding modes) for a block selected for inter-frame coding and for 13 different coding modes for a block selected for intra-frame coding. In other aspects, encoding module 30 may first use another mode selection technique to reduce the set of possible modes and then use the techniques of this disclosure to estimate the coding costs for the reduced set of coding modes.

When there are more coding modes for which to estimate the coding cost, encoding module 30 selects the next coding mode and estimates the cost of coding the data in the selected coding mode. When there are no more coding modes for which to estimate the coding cost, encoding module 30 selects one of the modes to use for coding the block based at least on the estimated coding costs (72). In one example, encoding module 30 may select the coding mode that has the lowest estimated coding cost. Once the mode has been selected, encoding module 30 may apply the selected mode to code the particular block (74). The process may continue for additional blocks within a given frame. As an example, the process may continue until all of the blocks in the frame have been coded using coding modes selected in accordance with the techniques described herein. Moreover, the process may continue until the blocks of a plurality of frames have been coded using high-efficiency modes.
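As a non-limiting illustration, the loop of FIG. 4 can be sketched as an iteration over candidate modes that keeps the mode with the lowest estimated cost. The sketch assumes the cost J = D + λ_mode · (R_residual + R_other) and a single multiplier shared by all candidates; the mode names and numeric values in the usage note are hypothetical.

```python
def select_mode(candidates, lagrange_multiplier):
    """Pick the candidate mode with the lowest estimated coding cost.

    candidates maps a mode name to (distortion, r_residual, r_other),
    where r_residual may be an estimate rather than an exact bit count.
    Returns (best_mode, best_cost).
    """
    best_mode, best_cost = None, None
    for mode, (distortion, r_residual, r_other) in candidates.items():
        cost = distortion + lagrange_multiplier * (r_residual + r_other)
        if best_cost is None or cost < best_cost:
            best_mode, best_cost = mode, cost
    return best_mode, best_cost
```

For example, with candidates `{"intra_4x4": (300, 40, 6), "inter_16x16": (350, 10, 12), "inter_8x8": (280, 35, 30)}` and a multiplier of 4, the costs are 484, 438, and 540, so `inter_16x16` is selected even though it has the highest distortion.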

FIG. 5 is a flow diagram illustrating exemplary operation of an encoding module, e.g., encoding module 30 of FIG. 2, that estimates the number of bits associated with coding the residual coefficients of a block. After selecting one of the coding modes for which a coding cost is to be estimated, encoding module 30 generates residual data of the block for the selected mode (80). For a block selected for intra-frame coding, for example, spatial prediction module 34 generates the residual data for the block based on a comparison of the block with a predicted version of the block. Alternatively, for a block selected for inter-frame coding, motion estimation module 36 and motion compensation module 38 compute the residual data for the block based on a comparison of the block with a corresponding block in a reference frame. In some aspects, the residual data may already have been computed in order to generate the distortion metric of the block. In this case, encoding module 30 may retrieve the residual data from memory.

Transform module 40 transforms the residual coefficients of the block in accordance with a transform function to generate transform coefficients for the residual data (82). Transform module 40 may, for example, apply a 4×4 or 8×8 integer transform or a DCT transform to the residual data to generate the transform coefficients for the residual data. Bit estimation module 42 compares one of the transform coefficients with a corresponding threshold to determine whether the transform coefficient is greater than or equal to the threshold (84). The threshold corresponding to the transform coefficient may be computed as a function of the QP of encoding module 30. If the transform coefficient is greater than or equal to the corresponding threshold, bit estimation module 42 identifies the transform coefficient as a coefficient that will remain non-zero after quantization (86). If the transform coefficient is less than the corresponding threshold, bit estimation module 42 identifies the transform coefficient as a coefficient that will become zero after quantization (88).
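
The per-coefficient comparison of steps 84 to 88 can be sketched as follows. This is an illustrative sketch only: the function name and the sample values are hypothetical, the thresholds are taken as given (how they are derived from the QP is not restated in this passage), and comparing the magnitude of each coefficient rather than its signed value is an assumption made here so that negative coefficients are handled sensibly.

```python
from typing import List


def classify_coefficients(coeffs: List[List[int]],
                          thresholds: List[List[int]]) -> List[List[int]]:
    """Return a matrix of ones and zeros with the same shape as `coeffs`.

    A one marks a coefficient predicted to remain non-zero after
    quantization (magnitude >= threshold, step 86); a zero marks a
    coefficient predicted to become zero (magnitude < threshold, step 88).
    Using abs() is an assumption of this sketch.
    """
    return [
        [1 if abs(c) >= t else 0 for c, t in zip(coeff_row, thr_row)]
        for coeff_row, thr_row in zip(coeffs, thresholds)
    ]


# Made-up 2x2 example; real blocks would be 4x4 or 8x8.
example_coeffs = [[10, -3], [0, 7]]
example_thresholds = [[5, 5], [5, 5]]
example_mask = classify_coefficients(example_coeffs, example_thresholds)
```

Because the thresholds depend only on the QP, a table of them could be computed once per QP value and reused for every block coded with that QP.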

Bit estimation module 42 determines whether there are additional transform coefficients for the residual data of the block (90). If there are additional transform coefficients for the block, bit estimation module 42 selects another one of the coefficients and compares it with its corresponding threshold. If there are no additional transform coefficients to analyze, bit estimation module 42 determines the number of coefficients identified as remaining non-zero after quantization (92). Bit estimation module 42 also sums the absolute values of at least a portion of the transform coefficients identified as remaining non-zero after quantization (94). Bit estimation module 42 estimates the number of bits associated with coding the residual data using the determined number of non-zero coefficients and the sum over that portion of the non-zero coefficients (96). Bit estimation module 42 may, for example, estimate the number of bits associated with coding the residual data using equation (4) above. In this manner, encoding module 30 estimates the number of bits associated with coding the residual data in the selected mode without quantizing or encoding the residual data of the block.
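
Steps 90 to 96 can be sketched as below. Equation (4) is not reproduced in this passage, so the sketch assumes, purely for illustration, a linear model of the form bits = a1 * SNZ + a2 * NZ + a3, where NZ is the count of coefficients predicted to remain non-zero and SNZ is the sum of their absolute values. The weights `a1`, `a2`, `a3`, the function name, and the sample values are all hypothetical; they are not taken from the text.

```python
from typing import List, Tuple


def estimate_residual_bits(coeffs: List[List[int]],
                           thresholds: List[List[int]],
                           a1: float, a2: float, a3: float
                           ) -> Tuple[int, int, float]:
    """Estimate residual bits without quantizing or entropy coding.

    Returns (nz, snz, bits), where `nz` is the number of coefficients whose
    magnitude is >= the corresponding threshold (step 92), `snz` is the sum
    of their absolute values (step 94), and `bits` is the estimate from the
    ASSUMED linear model a1 * snz + a2 * nz + a3 (step 96).
    """
    nz = 0
    snz = 0
    for coeff_row, thr_row in zip(coeffs, thresholds):
        for c, t in zip(coeff_row, thr_row):
            if abs(c) >= t:          # predicted to survive quantization
                nz += 1
                snz += abs(c)
    bits = a1 * snz + a2 * nz + a3
    return nz, snz, bits


# Made-up values: two coefficients (10 and 7) survive the threshold of 5.
nz, snz, bits = estimate_residual_bits(
    [[10, -3], [0, 7]], [[5, 5], [5, 5]], a1=0.5, a2=2.0, a3=1.0)
```

The point of the approach is that only comparisons, a count, and a sum are needed per block and per mode, which is cheaper than running quantization and entropy coding for every candidate mode.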

FIG. 6 is a flow diagram illustrating exemplary operation of an encoding module, e.g., encoding module 50 of FIG. 3, that estimates the number of bits associated with coding the residual coefficients of a block. After selecting one of the coding modes for which a coding cost is to be estimated, encoding module 50 generates the residual coefficients of the block (100). For a block selected for intra-frame coding, for example, spatial prediction module 34 computes the residual data for the block based on a comparison of the block with a predicted version of the block. Alternatively, for a block selected for inter-frame coding, motion estimation module 36 and motion compensation module 38 compute the residual data for the block based on a comparison of the block with a corresponding block in a reference frame. In some aspects, the residual coefficients may already have been computed in order to generate the distortion metric of the block.

Transform module 40 transforms the residual coefficients of the block in accordance with a transform function to generate transform coefficients for the residual data (102). Transform module 40 may, for example, apply a 4×4 or 8×8 integer transform or a DCT transform to the residual data to generate the transformed residual coefficients. Quantization module 46 quantizes the transform coefficients in accordance with the QP of encoding module 50 (104).

Bit estimation module 52 determines the number of quantized transform coefficients that are non-zero (106). Bit estimation module 52 also sums the absolute values of the non-zero levels, i.e., of the non-zero quantized transform coefficients (108). Bit estimation module 52 estimates the number of bits associated with coding the residual data using the computed number of non-zero quantized transform coefficients and the sum of the non-zero quantized transform coefficients (110). Bit estimation module 52 may, for example, estimate the number of bits associated with coding the residual coefficients using equation (4) above. In this manner, the encoding module estimates the number of bits associated with coding the residual data of the block in the selected mode without encoding the residual data.
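
For comparison with the threshold-based flow of FIG. 5, the quantize-then-count flow of FIG. 6 (steps 104 to 108) can be sketched as follows. The quantizer here is a deliberately simplified uniform quantizer with a single hypothetical step size `step`; it stands in for the QP-driven quantization of quantization module 46 and is not the H.264 quantizer. The function names and sample values are likewise illustrative assumptions.

```python
from typing import List, Tuple


def quantize(coeffs: List[int], step: int) -> List[int]:
    """Simplified uniform quantizer (an assumption of this sketch).

    Each coefficient magnitude is divided by `step` and truncated toward
    zero; the sign of the original coefficient is kept.
    """
    levels = []
    for c in coeffs:
        magnitude = abs(c) // step
        levels.append(magnitude if c >= 0 else -magnitude)
    return levels


def count_and_sum_levels(levels: List[int]) -> Tuple[int, int]:
    """Return (nz, snz): the number of non-zero quantized levels (step 106)
    and the sum of their absolute values (step 108)."""
    nonzero = [abs(level) for level in levels if level != 0]
    return len(nonzero), sum(nonzero)


# Made-up coefficients in scan order, quantized with step size 4.
example_levels = quantize([10, -3, 0, 7, -9], step=4)
example_nz, example_snz = count_and_sum_levels(example_levels)
```

The two statistics would then feed the same bit estimate as in the FIG. 5 flow; the difference is that here the block is actually quantized, while entropy coding is still skipped.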

Based on the teachings herein, it should be apparent that an aspect disclosed herein may be implemented independently of any other aspects and that two or more of these aspects may be combined in various ways. The techniques described herein may be implemented in hardware, software, firmware, or any combination thereof. If implemented in hardware, the techniques may be realized using digital hardware, analog hardware, or a combination thereof. If implemented in software, the techniques may be realized at least in part by a computer program product that includes a computer-readable medium on which instructions or code are stored. The instructions or code associated with the computer-readable medium of the computer program product may be executed by a computer, e.g., by one or more processors, such as one or more digital signal processors (DSPs), general-purpose microprocessors, ASICs, FPGAs, or other equivalent integrated or discrete logic circuitry.

By way of example, and not limitation, such computer-readable media may comprise RAM, such as synchronous dynamic random access memory (SDRAM), read-only memory (ROM), non-volatile random access memory (NVRAM), electrically erasable programmable read-only memory (EEPROM), FLASH memory, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other tangible medium that can be used to carry or store desired program code in the form of instructions or instruction structures and that can be accessed by a computer.

幾つかの側面及び例が説明されている。しかしながら、これらの例の様々な修正が可能であり、さらに、ここにおいて提示される原理は、その他の側面に対しても同様に適用することができる。これらの及びその他の側面は、以下の請求項の適用範囲内である。
以下に、本願出願の当初の特許請求の範囲に記載された発明を付記する。
［Ｃ１］
デジタル映像データを処理するための方法であって、
量子化されたときにゼロでないままである画素ブロックの残差データに関する１つ以上の変換係数を識別することと、
少なくとも前記識別された変換係数に基づいて前記残差データのコーディングと関連づけられたビット数を推定することと、
前記残差データをコーディングすることと関連づけられた少なくとも前記推定されたビット数に基づいて前記画素ブロックをコーディングするためのコーディングコストを推定すること、とを備える、方法。
［Ｃ２］
前記変換係数を識別することは、前記変換係数の各々を複数のしきい値のうちの対応する１つと比較して量子化されたときにゼロでないままである前記変換係数を識別することを備え、前記複数のしきい値の各々は、量子化パラメータ（ＱＰ）の関数として計算されるＣ１に記載の方法。
［Ｃ３］
前記変換係数の各々を複数のしきい値のうちの対応する１つと比較して量子化されたときにゼロでないままである前記変換係数を識別することは、対応するしきい値よりも小さい前記変換係数を、量子化されたときにゼロでないままである変換係数として識別することを備えるＣ２に記載の方法。
［Ｃ４］
複数の組のしきい値を予め計算することであって、前記しきい値の組の各々は、前記ＱＰの異なる値に対応することと、
前記画素ブロックを符号化するために用いられる前記ＱＰの前記値に基づいて前記複数のしきい値の組のうちの１つを選択すること、とをさらに備えるＣ２に記載の方法。
［Ｃ５］
前記残差データをコーディングすることと関連づけられた前記ビット数を推定することは、
量子化されたときにゼロでないままであるとして識別された前記変換係数の数を決定することと、
量子化されたときにゼロでないままであるとして識別された前記変換係数のうちの少なくとも１つの絶対値を合計することと、
ゼロでない変換係数の少なくとも前記決定された数及び前記少なくとも１つのゼロでない変換係数の前記絶対値の和に基づいて前記残差データのコーディングと関連づけられた前記ビット数を推定すること、とを備えるＣ１に記載の方法。
［Ｃ６］
前記残差データのコーディングと関連づけられた前記ビット数を推定することは、少なくとも２つのブロックモードの各々において前記残差データをコーディングするために要求されるビット数を推定することを備え、前記コーディングコストを推定することは、前記少なくとも２つのブロックモードの各々において、前記ブロックモードのうちの前記各々の１つにおける少なくとも前記推定されたビット数に基づいて前記コーディングコストを推定することを備え、前記モードの各々に関して少なくとも前記推定されたコーディングコストに基づいて前記ブロックモードのうちの１つを選択することをさらに備えるＣ１に記載の方法。
［Ｃ７］
前記モードの各々に関して、前記残差データのコーディングと関連づけられた少なくとも前記推定されたビット数を用いて前記画素ブロックをコーディングするための総コーディングコストを推定することと、
前記複数のモードのうちで最低の推定された総コーディングコストを有するモードを選択することと、
前記選択されたモードを適用して前記画素ブロックをコーディングすること、とをさらに備えるＣ６に記載の方法。
［Ｃ８］
前記総コーディングコストを推定することは、
前記画素ブロックに関する歪みメトリックを計算することと、
前記画素ブロックの非残差データのコーディングと関連づけられたビット数を計算することと、
少なくとも前記歪みメトリック、前記非残差データのコーディングと関連づけられた前記ビット数、及び前記残差データのコーディングと関連づけられた前記ビット数に基づいて前記画素ブロックをコーディングするための前記総コーディングコストを推定すること、とを備えるＣ７に記載の方法。
［Ｃ９］
前記残差データのコーディングと関連づけられた少なくとも前記推定されたビット数に基づいてコーディングコーディングを選択することと、
前記コーディングモードを選択後に前記残差データに関する前記変換係数を量子化することと、
前記残差データに関する前記量子化された変換係数を符号化することと、
前記残差データに関する前記符号化された係数を送信すること、とをさらに備えるＣ１に記載の方法。
［Ｃ１０］
前記変換係数の行列を生成することであって、前記変換係数行列の行数は、前記ブロック内における画素の行数と等しく、前記変換係数行列の列数は、前記ブロック内における画素の列数と等しいことと、
前記変換係数行列をしきい値行列と比較することであって、前記しきい値行列は、前記変換係数行列の次元と同じ次元を有し、前記比較は、１及びゼロの行列が得られ、前記ゼロは、量子化後にゼロになる前記変換係数行列内の位置を表し、前記１は、量子化後にゼロでないままである前記変換係数行列内の位置を表すことと、
前記１及びゼロの行列内における１の数を合計して量子化時にゼロでないままであるとして識別された前記変換係数の数を計算することと、
前記１及びゼロの行列内の前記１の位置に対応する前記変換係数行列内の前記変換係数のうちの少なくとも１つの絶対値を合計することと、
少なくとも前記ゼロでない変換係数の数及び前記少なくとも１つのゼロでない変換係数の和に基づいて前記残差データのコーディングと関連づけられた前記ビット数を推定すること、とをさらに備えるＣ１に記載の方法。
［Ｃ１１］
デジタル映像データを処理するための装置であって、
画素ブロックの残差データに関する変換係数を生成する変換モジュールと、
量子化されたときにゼロでないままである前記変換係数のうちの１つ以上を識別し及び少なくとも前記識別された変換係数に基づいて前記残差データのコーディングと関連づけられたビット数を推定するビット推定モジュールと、
前記残差データをコーディングすることと関連づけられた少なくとも前記推定されたビット数に基づいて前記画素ブロックをコーディングするためのコーディングコストを推定する制御モジュールと、を備える、装置。
［Ｃ１２］
前記ビット推定モジュールは、前記変換係数の各々を複数のしきい値のうちの対応する１つと比較して量子化されたときにゼロでないままである変換係数を識別し、前記複数のしきい値の各々は、量子化パラメータ（ＱＰ）の関数として計算されるＣ１１に記載の装置。
［Ｃ１３］
前記ビット推定モジュールは、対応するしきい値よりも小さい前記変換係数を、量子化されたときにゼロでないままである変換係数として識別するＣ１２に記載の装置。
［Ｃ１４］
前記ビット推定モジュールは、複数の組のしきい値を予め計算し、前記しきい値の組の各々は、前記ＱＰの異なる値に対応し、前記しきい値の組の各々は、前記ＱＰの異なる値に対応し、前記画素ブロックを符号化するために用いられる前記ＱＰの前記値に基づいて前記複数のしきい値の組のうちの１つを選択するＣ１２に記載の装置。
［Ｃ１５］
前記ビット推定モジュールは、量子化されたときにゼロでないままであるとして識別された前記変換係数の数を決定し、量子化されたときにゼロでないままであるとして識別された前記変換係数のうちの少なくとも１つの絶対値を合計し及びゼロでない変換係数の少なくとも前記決定された数及び前記少なくとも１つのゼロでない変換係数の前記絶対値の和に基づいて前記残差データのコーディングと関連づけられた前記ビット数を推定するＣ１１に記載の装置。
［Ｃ１６］
前記ビット推定モジュールは、少なくとも２つのブロックモードの各々における前記残差データのコーディングと関連づけられた前記ビット数を推定し、
前記制御モジュールは、前記少なくとも２つのブロックモードのうちの各々の１つにおける少なくとも前記推定されたビット数に基づいて前記ブロックの各々に関するコーディングコストを推定し、及び前記モードの各々に関して少なくとも前記推定されたコーディングコストに基づいて前記ブロックモードのうちの１つを選択するＣ１１に記載の装置。
［Ｃ１７］
前記制御モジュールは、前記モードの各々に関して、前記残差データのコーディングと関連づけられた少なくとも前記推定されたビット数を用いて前記画素ブロックをコーディングするための総コーディングコストを推定し、前記複数のモードのうちで最低の推定された総コーディングコストを有するモードを選択し、及び前記選択されたモードを適用して前記画素ブロックをコーディングするＣ１６に記載の装置。
［Ｃ１８］
前記制御モジュールは、前記画素ブロックに関する歪みメトリックを計算し、前記画素ブロックの非残差データのコーディングと関連づけられたビット数を計算し及び少なくとも前記歪みメトリック、前記非残差データのコーディングと関連づけられたビット数及び前記残差データのコーディングと関連づけられた前記ビット数に基づいて前記画素ブロックをコーディングするための前記総コーディングコストを推定するＣ１７に記載の装置。
［Ｃ１９］
前記残差データをコーディングすることと関連づけられた少なくとも前記推定されたビット数に基づいてコーディングモードを選択する制御モジュールと、
前記コーディングモードの選択後に前記残差データに関する前記変換係数を量子化する量子化モジュールと、
前記残差データに関する前記量子化された変換係数を符号化するエントロピー符号化モジュールと、
前記残差データに関する前記符号化された係数を送信する送信機と、をさらに備えるＣ１１に記載の装置。
［Ｃ２０］
前記変換モジュールは、前記変換係数の行列を生成し、前記変換係数行列の行数は、前記ブロック内における画素の行数と等しく、前記変換係数行列の列数は、前記ブロック内における画素の列数と等しく、
前記ビット推定モジュールは、前記変換係数行列をしきい値行列と比較し、前記しきい値行列は、前記変換係数行列の次元と同じ次元を有し、前記比較は、１及びゼロの行列が得られ、前記ゼロは、量子化後にゼロになる前記変換係数行列内の位置を表し、前記１は、量子化後にゼロでないままである前記変換係数行列内の位置を表し、
前記ビット推定モジュールは、前記１及びゼロの行列内における１の数を合計して量子化されたときにゼロでないままであるとして識別された前記変換係数の数を計算し、前記１及びゼロの行列内の前記１の位置に対応する前記変換係数行列内の前記変換係数のうちの少なくとも１つの絶対値を合計し、及び少なくとも前記ゼロでない変換係数の数及び前記少なくとも１つのゼロでない変換係数の和に基づいて前記残差データのコーディングと関連づけられた前記ビット数を推定するＣ１１に記載の装置。
［Ｃ２１］
デジタル映像データを処理するための装置であって、
量子化されたときにゼロでないままである画素ブロックの残差データに関する１つ以上の変換係数を識別するための手段と、
少なくとも前記識別された変換係数に基づいて前記残差データのコーディングと関連づけられたビット数を推定するための手段と、
前記残差データをコーディングすることと関連づけられた少なくとも前記推定されたビット数に基づいて前記画素ブロックをコーディングするためのコーディングコストを推定するための手段と、を備える、装置。
［Ｃ２２］
前記識別する手段は、前記変換係数の各々を複数のしきい値のうちの対応する１つと比較して量子化されたときにゼロでないままである変換係数を識別し、前記複数のしきい値の各々は、量子化パラメータ（ＱＰ）の関数として計算されるＣ２１に記載の装置。
［Ｃ２３］
前記識別する手段は、対応するしきい値よりも小さい前記変換係数を、量子化されたときにゼロでないままである変換係数として識別するＣ２２に記載の装置。
［Ｃ２４］
複数の組のしきい値を予め計算するための手段であって、前記しきい値の組の各々は、前記ＱＰの異なる値に対応する手段と、
前記画素ブロックを符号化するために用いられる前記ＱＰの前記値に基づいて前記複数のしきい値の組のうちの１つを選択するための手段と、をさらに備えるＣ２２に記載の装置。
［Ｃ２５］
前記推定する手段は、量子化されたときにゼロでないままであるとして識別された前記変換係数の数を決定し、量子化されたときにゼロでないままであるとして識別された前記変換係数のうちの少なくとも１つの絶対値を合計し、及びゼロでない変換係数の少なくとも前記決定された数及び前記少なくとも１つのゼロでない変換係数の前記絶対値の和に基づいて前記残差データのコーディングと関連づけられた前記ビット数を推定するＣ２１に記載の装置。
［Ｃ２６］
前記ビット推定手段は、少なくとも２つのブロックモードの各々における前記残差データのコーディングと関連づけられたビット数を推定し、及び前記コーディングコスト推定手段は、前記少なくとも２つのブロックモードのうちの各々の１つにおける少なくとも前記推定されたビット数に基づいて前記ブロックモードの各々に関するコーディングコストを推定し、及び前記ブロックモードの各々に関して少なくとも前記推定されたビット数に基づいて前記ブロックモードのうちの１つを選択するための手段をさらに備えるＣ２１に記載の装置。
［Ｃ２７］
前記モードの各々に関して、前記残差データのコーディングと関連づけられた少なくとも前記推定されたビット数を用いて前記画素ブロックをコーディングするための総コーディングコストを推定するための手段をさらに備え、前記選択する手段は、前記複数のモードのうちで最低の推定された総コーディングコストを有するモードを選択するＣ２６に記載の装置。
［Ｃ２８］
前記コーディングコスト推定手段は、前記画素ブロックに関する歪みメトリックを計算し、前記画素ブロックの非残差データのコーディングと関連づけられたビット数を計算し、及び少なくとも前記歪みメトリック、前記非残差データのコーディングと関連づけられた前記ビット数及び前記残差データのコーディングと関連づけられた前記ビット数に基づいて前記画素ブロックをコーディングするための前記総コーディングコストを推定するＣ２７に記載の装置。
［Ｃ２９］
前記残差データのコーディングと関連づけられた少なくとも前記推定されたビット数に基づいてコーディングモードを選択するための手段と、
前記コーディングモードを選択後に前記残差データに関する前記変換係数を量子化するための手段と、
前記残差データに関する前記量子化された変換係数を符号化するための手段と、
前記残差データに関する前記符号化された係数を送信するための手段と、をさらに備えるＣ２１に記載の装置。
［Ｃ３０］
前記変換係数の行列を生成するための手段をさらに備え、前記変換係数行列の行数は、前記ブロック内における画素の行数と等しく、前記変換係数行列の列数は、前記ブロック内における画素の列数と等しく、
前記識別する手段は、前記変換係数行列をしきい値行列と比較し、前記しきい値行列は、前記変換係数行列の次元と同じ次元を有し、前記比較は、１及びゼロの行列が得られ、前記ゼロは、量子化後にゼロになる前記変換係数行列内の位置を表し、前記１は、量子化後にゼロでないままである前記変換係数行列内の位置を表し、
前記推定する手段は、前記１及びゼロの行列内における１の数を合計して量子化されたときにゼロでないままであるとして識別された前記変換係数の数を計算し、前記１及びゼロの行列内の前記１の位置に対応する前記変換係数行列内の前記変換係数のうちの少なくとも１つの絶対値を合計し、及び少なくとも前記ゼロでない変換係数の数及び前記少なくとも１つのゼロでない変換係数の和に基づいて前記残差データのコーディングと関連づけられた前記ビット数を推定するＣ２１に記載の装置。
［Ｃ３１］
命令が格納されているコンピュータによって読み取り可能な媒体を備える、デジタル映像データを処理するためのコンピュータプログラム製品であって、前記命令は、
量子化されたときにゼロでないままである画素ブロックの残差データに関する１つ以上の変換係数を識別するための符号と、
少なくとも前記識別された変換係数に基づいて前記残差データのコーディングと関連づけられたビット数を推定するための符号と、
前記残差データをコーディングすることと関連づけられた少なくとも前記推定されたビット数に基づいて前記画素ブロックをコーディングするためのコーディングコストを推定するための符号と、を備える、コンピュータプログラム製品。
［Ｃ３２］
前記変換係数を識別するための符号は、前記変換係数の各々を複数のしきい値のうちの対応する１つと比較して量子化されたときにゼロでないままである変換係数を識別し、前記複数のしきい値の各々は、量子化パラメータ（ＱＰ）の関数として計算されるＣ３１に記載のコンピュータプログラム製品。
［Ｃ３３］
前記変換係数の各々を複数のしきい値のうちの対応する１つと比較して量子化されたときにゼロでないままである変換係数を識別するための符号は、対応するしきい値よりも小さい前記変換係数を、量子化されたときにゼロでないままである変換係数として識別するための符号を備えるＣ３２に記載のコンピュータプログラム製品。
［Ｃ３４］
複数の組のしきい値を予め計算するための符号であって、前記しきい値の組の各々は、前記ＱＰの異なる値に対応する符号と、
前記画素ブロックを符号化するために用いられる前記ＱＰの前記値に基づいて前記複数のしきい値の組のうちの１つを選択するための符号と、をさらに備えるＣ３２に記載のコンピュータプログラム製品。
［Ｃ３５］
前記残差データのコーディングと関連づけられた前記ビット数を推定するための符号は、
量子化されたときにゼロでないままであるとして識別された前記変換係数の数を決定するための符号と、
量子化されたときにゼロでないままであるとして識別された前記変換係数のうちの少なくとも１つの絶対値を合計するための符号と、
ゼロでない変換係数の少なくとも前記決定された数及び前記少なくとも１つのゼロでない変換係数の前記絶対値の和に基づいて前記残差データのコーディングと関連づけられた前記ビット数を推定するための符号と、を備えるＣ３１に記載のコンピュータプログラム製品。
［Ｃ３６］
前記残差データのコーディングと関連づけられた前記ビット数を推定するための符号は、少なくとも２つのブロックモードのうちの各々における前記残差データのコーディングと関連づけられたビット数を推定するための符号を備え、及び前記コーディングコストを推定するための符号は、前記ブロックモードのうちの各々の１つにおける少なくとも前記推定されたビット数に基づいて前記少なくとも２つのブロックノードの各々に関する前記コーディングコストを推定するための符号を備え、及び前記ブロックモードの各々に関して少なくとも前記推定されたビット数に基づいて前記ブロックモードのうちの１つを選択するための符号をさらに備えるＣ３１に記載のコンピュータプログラム製品。
［Ｃ３７］
前記モードの各々に関して、前記残差データのコーディングと関連づけられた少なくとも前記推定されたビット数を用いて前記画素ブロックをコーディングするための総コーディングコストを推定するための符号と、
前記複数のモードのうちで最低の推定された総コーディングコストを有するモードを選択するための符号と、
前記選択されたモードを適用して前記画素ブロックをコーディングするための符号と、をさらに備えるＣ３６に記載のコンピュータプログラム製品。
［Ｃ３８］
前記総コーディングコストを推定するための符号は、
前記画素ブロックに関する歪みメトリックを計算するための符号と、
前記画素ブロックの非残差データのコーディングと関連づけられたビット数を計算するための符号と、
少なくとも前記歪みメトリック、前記非残差データのコーディングと関連づけられた前記ビット数及び前記残差データのコーディングと関連づけられた前記ビット数に基づいて前記画素ブロックをコーディングするための前記総コーディングコストを推定するための符号と、を備えるＣ３７に記載のコンピュータプログラム製品。
［Ｃ３９］
前記残差データのコーディングと関連づけられた少なくとも前記推定されたビット数に基づいてコーディングモードを選択するための符号と、
前記コーディングモードを選択後に前記残差データに関する前記変換係数を量子化するための符号と、
前記残差データに関する前記量子化された変換係数を符号化するための符号と、
前記残差データに関する前記符号化された係数を送信するための符号と、をさらに備えるＣ３１に記載のコンピュータプログラム製品。
［Ｃ４０］
前記変換係数の行列を生成するための符号であって、前記変換係数行列の行数は、前記ブロック内における画素の行数と等しく、前記変換係数行列の列数は、前記ブロック内における画素の列数と等しい符号と、
前記変換係数行列をしきい値行列と比較するための符号であって、前記しきい値行列は、前記変換係数行列の次元と同じ次元を有し、前記比較は、１及びゼロの行列が得られ、前記ゼロは、量子化後にゼロになる前記変換係数行列内の位置を表し、前記１は、量子化後にゼロでないままである前記変換係数行列内の位置を表す符号と、
前記１及びゼロの行列内における１の数を合計して量子化されたときにゼロでないままであるとして識別された前記変換係数の数を計算するための符号と、
前記１及びゼロの行列内における前記１の位置に対応する前記変換係数行列内の前記変換係数のうちの少なくとも１つの絶対値を合計するための符号と、
少なくとも前記ゼロでない変換係数の数及び前記少なくとも１つのゼロでない変換係数の和に基づいて前記残差データのコーディングと関連づけられた前記ビット数を推定するための符号と、をさらに備えるＣ３１に記載のコンピュータプログラム製品。
Several aspects and examples have been described. However, various modifications of these examples are possible, and the principles presented herein can be applied to other aspects as well. These and other aspects are within the scope of the following claims.
Hereinafter, the invention described in the scope of claims of the present application will be appended.
[C1]
A method for processing digital video data,
Identifying one or more transform coefficients for pixel block residual data that remains non-zero when quantized;
Estimating the number of bits associated with the coding of the residual data based at least on the identified transform coefficients;
Estimating a coding cost for coding the pixel block based on at least the estimated number of bits associated with coding the residual data.
[C2]
Identifying the transform coefficients comprises identifying the transform coefficients that remain non-zero when quantized by comparing each of the transform coefficients with a corresponding one of a plurality of threshold values. The method of C1, wherein each of the plurality of thresholds is calculated as a function of a quantization parameter (QP).
[C3]
Identifying each transform coefficient that remains non-zero when quantized by comparing each of the transform coefficients with a corresponding one of a plurality of threshold values is less than the corresponding threshold value The method of C2, comprising identifying the transform coefficients as transform coefficients that remain non-zero when quantized.
[C4]
Pre-calculating a plurality of sets of thresholds, each of the sets of thresholds corresponding to a different value of the QP;
The method of C2, further comprising: selecting one of the plurality of threshold sets based on the value of the QP used to encode the pixel block.
[C5]
Estimating the number of bits associated with coding the residual data is
Determining the number of said transform coefficients identified as remaining non-zero when quantized;
Summing the absolute value of at least one of the transform coefficients identified as remaining non-zero when quantized;
Estimating the number of bits associated with the coding of the residual data based on at least the determined number of non-zero transform coefficients and a sum of the absolute values of the at least one non-zero transform coefficient. The method according to C1.
[C6]
Estimating the number of bits associated with coding of the residual data comprises estimating the number of bits required to code the residual data in each of at least two block modes. Estimating the cost comprises, in each of the at least two block modes, estimating the coding cost based on at least the estimated number of bits in each one of the block modes, The method of C1, further comprising selecting one of the block modes based on at least the estimated coding cost for each of the modes.
[C7]
For each of the modes, estimating a total coding cost for coding the pixel block using at least the estimated number of bits associated with coding of the residual data;
Selecting a mode having the lowest estimated total coding cost among the plurality of modes;
Applying the selected mode to coding the pixel block; and the method of C6.
[C8]
Estimating the total coding cost is:
Calculating a distortion metric for the pixel block;
Calculating the number of bits associated with the coding of non-residual data for the pixel block;
The total coding cost for coding the pixel block based on at least the distortion metric, the number of bits associated with the coding of the non-residual data, and the number of bits associated with the coding of the residual data; Estimating the method according to C7.
[C9]
Selecting a coding coding based on at least the estimated number of bits associated with the coding of the residual data;
Quantizing the transform coefficients for the residual data after selecting the coding mode;
Encoding the quantized transform coefficients for the residual data;
Transmitting the encoded coefficients for the residual data; and the method of C1.
[C10]
Generating a matrix of transform coefficients, wherein the number of rows of the transform coefficient matrix is equal to the number of rows of pixels in the block, and the number of columns of the transform coefficient matrix is the number of columns of pixels in the block Is equal to
Comparing the transform coefficient matrix with a threshold matrix, the threshold matrix having the same dimensions as the transform coefficient matrix, and the comparison yields a matrix of 1 and zero; The zero represents a position in the transform coefficient matrix that becomes zero after quantization, and the 1 represents a position in the transform coefficient matrix that remains non-zero after quantization;
Summing the number of ones in the one and zero matrix to calculate the number of transform coefficients identified as remaining non-zero upon quantization;
Summing the absolute value of at least one of the transform coefficients in the transform coefficient matrix corresponding to the 1 position in the 1 and zero matrix;
The method of C1, further comprising: estimating the number of bits associated with coding of the residual data based on at least the number of non-zero transform coefficients and the sum of the at least one non-zero transform coefficient.
[C11]
An apparatus for processing digital video data,
A transform module that generates transform coefficients for the residual data of the pixel block;
Bits identifying one or more of the transform coefficients that remain non-zero when quantized and estimating a number of bits associated with coding of the residual data based at least on the identified transform coefficients An estimation module;
A control module for estimating a coding cost for coding the pixel block based on at least the estimated number of bits associated with coding the residual data.
[C12]
The bit estimation module identifies a transform coefficient that remains non-zero when quantized by comparing each of the transform coefficients with a corresponding one of a plurality of threshold values; Each of which is calculated as a function of a quantization parameter (QP).
[C13]
The apparatus of C12, wherein the bit estimation module identifies the transform coefficients that are less than a corresponding threshold as transform coefficients that remain non-zero when quantized.
[C14]
The bit estimation module pre-calculates a plurality of sets of thresholds, each of the threshold sets corresponding to a different value of the QP, and each of the threshold sets is of the QP The apparatus of C12, corresponding to different values and selecting one of the plurality of threshold sets based on the value of the QP used to encode the pixel block.
[C15]
The bit estimation module determines the number of transform coefficients that are identified as being non-zero when quantized and of the transform coefficients that are identified as remaining non-zero when quantized. The at least one absolute value of and summing the residual data based on at least the determined number of non-zero transform coefficients and the sum of the absolute values of the at least one non-zero transform coefficient The apparatus according to C11, which estimates the number of bits.
[C16]
The bit estimation module estimates the number of bits associated with coding of the residual data in each of at least two block modes;
The control module estimates a coding cost for each of the blocks based on at least the estimated number of bits in each one of the at least two block modes, and at least the estimated for each of the modes. The apparatus of C11, wherein one of the block modes is selected based on a coding cost.
[C17]
The control module estimates, for each of the modes, a total coding cost for coding the pixel block using at least the estimated number of bits associated with coding of the residual data, and the plurality of modes The apparatus of C16, wherein the mode with the lowest estimated total coding cost is selected and the selected mode is applied to code the pixel block.
[C18]
The control module calculates a distortion metric for the pixel block, calculates a number of bits associated with coding of the non-residual data of the pixel block, and is associated with at least the distortion metric, coding of the non-residual data. The apparatus of C17, wherein the total coding cost for coding the pixel block is estimated based on the number of bits and the number of bits associated with the coding of the residual data.
[C19]
A control module that selects a coding mode based on at least the estimated number of bits associated with coding the residual data;
A quantization module for quantizing the transform coefficient for the residual data after selection of the coding mode;
An entropy encoding module that encodes the quantized transform coefficients for the residual data;
The apparatus of C11, further comprising: a transmitter that transmits the encoded coefficients for the residual data.
[C20]
The transform module generates a matrix of transform coefficients, the number of rows of the transform coefficient matrix is equal to the number of rows of pixels in the block, and the number of columns of the transform coefficient matrix is the number of columns of pixels in the block. Equal to the number,
The bit estimation module compares the transform coefficient matrix with a threshold matrix, the threshold matrix having the same dimensions as the transform coefficient matrix, and the comparison yields a matrix of 1 and zero. The zero represents a position in the transform coefficient matrix that becomes zero after quantization, and the one represents a position in the transform coefficient matrix that remains non-zero after quantization;
The bit estimation module calculates the number of transform coefficients identified as remaining non-zero when quantized by summing the number of ones in the one-and-zero matrix; Sum the absolute values of at least one of the transform coefficients in the transform coefficient matrix corresponding to the one position in the matrix, and at least the number of non-zero transform coefficients and the at least one non-zero transform coefficient The apparatus of C11, wherein the number of bits associated with coding of the residual data is estimated based on a sum.
[C21]
An apparatus for processing digital video data,
Means for identifying one or more transform coefficients for pixel block residual data that remains non-zero when quantized;
Means for estimating a number of bits associated with coding of the residual data based at least on the identified transform coefficients;
Means for estimating a coding cost for coding the pixel block based on at least the estimated number of bits associated with coding the residual data.
[C22]
Said means for identifying said transform coefficients that remain non-zero when quantized by comparing each of said transform coefficients with a corresponding one of a plurality of threshold values; Each of C21 is computed as a function of a quantization parameter (QP).
[C23]
The apparatus of C22, wherein the means for identifying identifies the transform coefficients that are less than a corresponding threshold as transform coefficients that remain non-zero when quantized.
[C24]
Means for pre-calculating a plurality of sets of thresholds, wherein each of the threshold sets corresponds to a different value of the QP;
The apparatus of C22, further comprising: means for selecting one of the plurality of threshold sets based on the value of the QP used to encode the pixel block.
[C25]
The means for estimating determines a number of the transform coefficients that are identified as being non-zero when quantized and of the transform coefficients that are identified as being non-zero when quantized. At least one absolute value of and summed with the coding of the residual data based on at least the determined number of non-zero transform coefficients and the sum of the absolute values of the at least one non-zero transform coefficient The apparatus of C21, wherein the number of bits is estimated.
[C26]
The bit estimation means estimates the number of bits associated with the coding of the residual data in each of at least two block modes, and the coding cost estimation means includes one of each of the at least two block modes. Estimating a coding cost for each of the block modes based on at least the estimated number of bits in one, and determining one of the block modes based on at least the estimated number of bits for each of the block modes The apparatus of C21, further comprising means for selecting.
[C27]
The apparatus according to C26, further comprising means for estimating, for each of the modes, a total coding cost for coding the pixel block using at least the estimated number of bits associated with coding of the residual data, wherein the means for selecting selects the mode having the lowest estimated total coding cost among the plurality of modes.
[C28]
The apparatus of C27, wherein the coding cost estimation means calculates a distortion metric for the pixel block, calculates a number of bits associated with coding of the non-residual data of the pixel block, and estimates the total coding cost for coding the pixel block based on at least the distortion metric, the number of bits associated with coding of the non-residual data, and the number of bits associated with coding of the residual data.
[C29]
Means for selecting a coding mode based on at least the estimated number of bits associated with coding of the residual data;
Means for quantizing the transform coefficients for the residual data after selecting the coding mode;
Means for encoding the quantized transform coefficients for the residual data;
The apparatus of C21, further comprising means for transmitting the encoded coefficients for the residual data.
[C30]
The apparatus of C21, further comprising means for generating a matrix of transform coefficients, wherein the number of rows of the transform coefficient matrix is equal to the number of rows of pixels in the block, and the number of columns of the transform coefficient matrix is equal to the number of columns of pixels in the block, wherein:
The identifying means compares the transform coefficient matrix with a threshold matrix having the same dimensions as the transform coefficient matrix, the comparison yielding a matrix of ones and zeros in which each zero represents a position in the transform coefficient matrix that becomes zero after quantization and each one represents a position in the transform coefficient matrix that remains non-zero after quantization; and
The means for estimating calculates the number of transform coefficients identified as remaining non-zero when quantized by summing the ones in the matrix of ones and zeros, sums the absolute values of at least one of the transform coefficients at the positions of the ones, and estimates the number of bits associated with coding of the residual data based on at least the number of non-zero transform coefficients and the sum of the absolute values of the at least one non-zero transform coefficient.
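As an illustration of the comparison and summation recited in C30, the following is a minimal Python sketch, not the claimed implementation. It assumes that a coefficient remains non-zero when its magnitude is at or above its threshold; the function names and the sample coefficient and threshold values are hypothetical.

```python
def nonzero_mask(coeffs, thresholds):
    """Return a matrix of 1s and 0s with the same dimensions as `coeffs`.

    A 1 marks a position assumed to stay non-zero after quantization
    (|coefficient| >= threshold); a 0 marks a position assumed to become zero.
    No quantization is performed here, only a comparison.
    """
    return [
        [1 if abs(c) >= t else 0 for c, t in zip(coeff_row, thr_row)]
        for coeff_row, thr_row in zip(coeffs, thresholds)
    ]


def count_and_abs_sum(coeffs, mask):
    """Count the 1s in the mask and sum |coefficient| at those positions."""
    nz_count = sum(sum(row) for row in mask)
    abs_sum = sum(
        abs(c)
        for coeff_row, mask_row in zip(coeffs, mask)
        for c, m in zip(coeff_row, mask_row)
        if m
    )
    return nz_count, abs_sum


# Hypothetical 4x4 transform coefficients and threshold matrix.
coeffs = [
    [52, -9, 3, 0],
    [-14, 6, -2, 1],
    [4, -3, 0, 0],
    [1, 0, 0, 0],
]
thresholds = [
    [8, 10, 8, 10],
    [10, 13, 10, 13],
    [8, 10, 8, 10],
    [10, 13, 10, 13],
]

mask = nonzero_mask(coeffs, thresholds)
nz_count, abs_sum = count_and_abs_sum(coeffs, mask)
```

With these sample values only the coefficients 52 and -14 meet their thresholds, so the count is 2 and the sum of absolute values is 66.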
[C31]
A computer program product for processing digital video data comprising a computer readable medium having instructions stored thereon, the instructions comprising:
A code for identifying one or more transform coefficients for the residual data of the pixel block that remains non-zero when quantized;
A code for estimating a number of bits associated with coding of the residual data based at least on the identified transform coefficients;
A computer program product comprising: a code for estimating a coding cost for coding the pixel block based on at least the estimated number of bits associated with coding the residual data.
[C32]
The code for identifying the transform coefficient identifies a transform coefficient that remains non-zero when quantized by comparing each of the transform coefficients with a corresponding one of a plurality of threshold values, The computer program product according to C31, wherein each of the plurality of thresholds is calculated as a function of a quantization parameter (QP).
[C33]
The computer program product of C32, wherein the code for identifying a transform coefficient that remains non-zero when quantized by comparing each of the transform coefficients with a corresponding one of the plurality of thresholds comprises a code for identifying each transform coefficient that is greater than or equal to the corresponding threshold as a transform coefficient that remains non-zero when quantized.
[C34]
A code for pre-calculating a plurality of sets of thresholds, each of the threshold sets corresponding to a different value of the QP;
A computer program product according to C32, further comprising: a code for selecting one of the plurality of threshold sets based on the value of the QP used to encode the pixel block.
[C35]
The computer program product according to C31, wherein the code for estimating the number of bits associated with the coding of the residual data comprises:
A code for determining the number of transform coefficients identified as remaining non-zero when quantized;
A code for summing the absolute values of at least one of the transform coefficients identified as remaining non-zero when quantized; and
A code for estimating the number of bits associated with coding of the residual data based on at least the determined number of non-zero transform coefficients and the sum of the absolute values of the at least one non-zero transform coefficient.
[C36]
The computer program product according to C31, wherein the code for estimating the number of bits associated with the coding of the residual data comprises a code for estimating the number of bits associated with the coding of the residual data in each of at least two block modes, and the code for estimating the coding cost comprises a code for estimating the coding cost for each of the at least two block modes based on at least the estimated number of bits in the respective block mode, the computer program product further comprising a code for selecting one of the block modes based on at least the estimated number of bits for each of the block modes.
[C37]
For each of the modes, a code for estimating a total coding cost for coding the pixel block using at least the estimated number of bits associated with coding of the residual data;
A code for selecting a mode having the lowest estimated total coding cost among the plurality of modes;
A computer program product according to C36, further comprising: a code for applying the selected mode to code the pixel block.
[C38]
The computer program product according to C37, wherein the code for estimating the total coding cost comprises:
A code for calculating a distortion metric for the pixel block;
A code for calculating the number of bits associated with the coding of the non-residual data of the pixel block; and
A code for estimating the total coding cost for coding the pixel block based at least on the distortion metric, the number of bits associated with the coding of the non-residual data, and the number of bits associated with the coding of the residual data.
[C39]
A code for selecting a coding mode based on at least the estimated number of bits associated with coding of the residual data;
A code for quantizing the transform coefficients for the residual data after selecting the coding mode;
A code for encoding the quantized transform coefficients for the residual data;
A computer program product according to C31, further comprising: a code for transmitting the encoded coefficients for the residual data.
[C40]
The computer program product according to C31, further comprising:
A code for generating a matrix of transform coefficients, wherein the number of rows of the transform coefficient matrix is equal to the number of rows of pixels in the block, and the number of columns of the transform coefficient matrix is equal to the number of columns of pixels in the block;
A code for comparing the transform coefficient matrix with a threshold matrix having the same dimensions as the transform coefficient matrix, the comparison yielding a matrix of ones and zeros in which each zero represents a position in the transform coefficient matrix that becomes zero after quantization and each one represents a position in the transform coefficient matrix that remains non-zero after quantization;
A code for calculating the number of transform coefficients identified as remaining non-zero when quantized by summing the ones in the matrix of ones and zeros;
A code for summing the absolute values of at least one of the transform coefficients in the transform coefficient matrix at the positions of the ones in the matrix of ones and zeros; and
A code for estimating the number of bits associated with coding of the residual data based on at least the number of non-zero transform coefficients and the sum of the absolute values of the at least one non-zero transform coefficient.

Claims

A method for processing digital video data,
Identifying, by a processor, one or more transform coefficients for residual data of a pixel block that are non-zero and remain non-zero when quantized, wherein the identifying is performed without quantizing the one or more transform coefficients;
Generating a matrix of transform coefficients, wherein the number of rows of the matrix of transform coefficients is equal to the number of rows of pixels in the block, and the number of columns of the matrix of transform coefficients is equal to the number of columns of pixels in the block;
Comparing the matrix of transform coefficients with a threshold matrix, wherein the threshold matrix has the same dimensions as the matrix of transform coefficients, and the comparison yields a matrix of 1s and 0s in which 0 represents a location in the matrix of transform coefficients that becomes zero after quantization and 1 represents a location in the matrix of transform coefficients that remains non-zero after quantization;
Adding the number of 1s in the matrix of 1s and 0s to calculate the number of transform coefficients identified as remaining non-zero when quantized;
Adding the absolute values of at least one of the transform coefficients in the matrix of transform coefficients corresponding to the locations of 1s in the matrix of 1s and 0s;
Estimating the number of bits associated with the coding of the residual data based on at least the number of non-zero transform coefficients and the sum of the absolute values of the at least one non-zero transform coefficient;
Estimating a coding cost for coding the pixel block based on the following equation:
J = D + λ_mode · (R_residual + R_other)
where J is the estimated coding cost, D is the distortion metric of the block, λ_mode is the Lagrange multiplier for each of the multiple coding modes, R_residual is the number of bits for coding the residual data using a context-adaptive coding technique, and R_other is the number of bits for coding other block data using a fixed length coding (FLC) technique or a universal variable length coding (VLC) technique.
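The cost equation of claim 1 can be worked through with a small Python sketch. The claim only requires that the residual bit estimate be based on at least the non-zero count and the sum of absolute values; the linear form used below and its coefficients a1, a2, a3 are assumptions for illustration, as are all numeric inputs.

```python
def estimate_residual_bits(nz_count, abs_sum, a1=0.5, a2=3.0, a3=1.0):
    """Estimate the residual bits from the non-zero count and the absolute sum.

    The linear model and the default coefficients are hypothetical; a real
    encoder would fit or tune them for its entropy coder.
    """
    return a1 * abs_sum + a2 * nz_count + a3


def coding_cost(distortion, lambda_mode, r_residual, r_other):
    """Evaluate J = D + lambda_mode * (R_residual + R_other)."""
    return distortion + lambda_mode * (r_residual + r_other)


# Hypothetical inputs: 2 surviving coefficients whose magnitudes sum to 66.
r_residual = estimate_residual_bits(nz_count=2, abs_sum=66)  # 0.5*66 + 3*2 + 1 = 40.0
cost = coding_cost(distortion=120.0, lambda_mode=4.0,
                   r_residual=r_residual, r_other=10.0)      # 120 + 4*(40 + 10) = 320.0
```

Because the estimate needs only a count and a sum, the cost can be evaluated for a candidate mode without quantizing or entropy coding the block.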

The method of claim 1, wherein each of the threshold values of the threshold matrix is calculated as a function of a quantization parameter (QP).

The method of claim 2, wherein locations in the matrix of transform coefficients that remain non-zero after quantization correspond to transform coefficients that satisfy respective corresponding thresholds.

Calculating a plurality of threshold matrices in advance, each of the threshold matrices corresponding to a different value of the QP;
The method of claim 2, further comprising: selecting the threshold matrix from the plurality of threshold matrices based on the value of the QP used to encode the pixel block.
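One way to picture the pre-calculated, QP-indexed threshold matrices is the Python sketch below. It assumes an H.264-style 4x4 integer quantizer of the form (|W| * MF + f) >> qbits; the multiplier table, the rounding offset of one third, and every function name are assumptions for illustration and are not taken from the claims.

```python
# Assumed H.264-style multipliers, indexed by QP % 6, for three classes of
# position in a 4x4 block (see position_class below).
MF = [
    (13107, 5243, 8066),
    (11916, 4660, 7490),
    (10082, 4194, 6554),
    (9362, 3647, 5825),
    (8192, 3355, 5243),
    (7282, 2893, 4559),
]


def position_class(i, j):
    """Class 0: both indices even; class 1: both odd; class 2: otherwise."""
    if i % 2 == 0 and j % 2 == 0:
        return 0
    if i % 2 == 1 and j % 2 == 1:
        return 1
    return 2


def _params(qp, i, j):
    qbits = 15 + qp // 6
    f = (1 << qbits) // 3  # assumed rounding offset
    return qbits, f, MF[qp % 6][position_class(i, j)]


def quantize(coef, qp, i, j):
    """Reference quantizer (magnitude only), used here to check thresholds."""
    qbits, f, mf = _params(qp, i, j)
    return (abs(coef) * mf + f) >> qbits


def threshold(qp, i, j):
    """Smallest magnitude whose quantized level is non-zero."""
    qbits, f, mf = _params(qp, i, j)
    need = (1 << qbits) - f
    return (need + mf - 1) // mf  # ceiling division


# Pre-calculate one 4x4 threshold matrix per QP value (0..51 assumed).
THRESHOLDS = {
    qp: [[threshold(qp, i, j) for j in range(4)] for i in range(4)]
    for qp in range(52)
}


def select_threshold_matrix(qp):
    """Pick the pre-calculated matrix for the QP used to encode the block."""
    return THRESHOLDS[qp]
```

The table is built once, so choosing the threshold matrix for a block is a single lookup, and a comparison against it gives the same zero or non-zero decision as running the assumed quantizer.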

An apparatus for processing digital video data, comprising:
A transform module that generates transform coefficients for residual data of a pixel block and generates a matrix of transform coefficients, wherein the number of rows of the matrix of transform coefficients is equal to the number of rows of pixels in the block, and the number of columns of the matrix of transform coefficients is equal to the number of columns of pixels in the block;
A bit estimation module that:
identifies, without quantizing the transform coefficients, one or more of the transform coefficients that are non-zero and remain non-zero when quantized;
compares the matrix of transform coefficients with a threshold matrix, wherein the threshold matrix has the same dimensions as the matrix of transform coefficients, and the comparison yields a matrix of 1s and 0s in which 0 represents a location in the matrix of transform coefficients that becomes zero after quantization and 1 represents a location in the matrix of transform coefficients that remains non-zero after quantization;
adds the number of 1s in the matrix of 1s and 0s to calculate the number of the transform coefficients identified as remaining non-zero when quantized;
adds the absolute values of at least one of the transform coefficients in the matrix of transform coefficients corresponding to the locations of 1s in the matrix of 1s and 0s; and
estimates the number of bits associated with coding of the residual data based on at least the number of non-zero transform coefficients and the sum of the absolute values of the at least one non-zero transform coefficient; and
A control module that estimates a coding cost for coding the pixel block based on the following equation:
J = D + λ_mode · (R_residual + R_other)
where J is the estimated coding cost, D is the distortion metric of the block, λ_mode is the Lagrange multiplier for each of the multiple coding modes, R_residual is the number of bits for coding the residual data using a context-adaptive coding technique, and R_other is the number of bits for coding other block data using a fixed length coding (FLC) technique or a universal variable length coding (VLC) technique.

The apparatus according to claim 5, wherein each threshold of the threshold matrix is calculated as a function of a quantization parameter (QP).

The apparatus according to claim 6, wherein the locations in the matrix of transform coefficients that remain non-zero after quantization correspond to transform coefficients that satisfy respective corresponding thresholds.

The apparatus of claim 6, wherein the bit estimation module pre-calculates a plurality of sets of thresholds, each of the threshold sets corresponding to a different value of the QP, and selects one of the plurality of threshold sets to be used as the threshold matrix based on the value of the QP used to encode the pixel block.

An apparatus for processing digital video data, comprising:
Means for identifying, without quantization, one or more transform coefficients for the residual data of a pixel block that are non-zero and remain non-zero when quantized;
Means for generating a matrix of transform coefficients, wherein the number of rows of the matrix of transform coefficients is equal to the number of rows of pixels in the block, and the number of columns of the matrix of transform coefficients is equal to the number of columns of pixels in the block, wherein the identifying means compares the matrix of transform coefficients with a threshold matrix having the same dimensions as the matrix of transform coefficients, and the comparison yields a matrix of 1s and 0s in which 0 represents a location in the matrix of transform coefficients that becomes zero after quantization and 1 represents a location in the matrix of transform coefficients that remains non-zero after quantization;
Means for estimating the number of bits associated with the coding of the residual data by calculating the number of transform coefficients identified as remaining non-zero when quantized by adding the number of 1s in the matrix of 1s and 0s, adding the absolute values of at least one of the transform coefficients in the matrix of transform coefficients corresponding to the locations of 1s in the matrix of 1s and 0s, and estimating the number of bits based on at least the number of non-zero transform coefficients and the sum of the absolute values of the at least one non-zero transform coefficient; and
Means for estimating a coding cost for coding the pixel block based on the following equation:
J = D + λ_mode · (R_residual + R_other)
where J is the estimated coding cost, D is the distortion metric of the block, λ_mode is the Lagrange multiplier for each of the multiple coding modes, R_residual is the number of bits for coding the residual data using a context-adaptive coding technique, and R_other is the number of bits for coding other block data using a fixed length coding (FLC) technique or a universal variable length coding (VLC) technique.

The apparatus of claim 9, wherein each of the threshold values of the threshold matrix is calculated as a function of a quantization parameter (QP).

The apparatus of claim 9, wherein a location of the matrix of transform coefficients that remains non-zero after quantization corresponds to a transform coefficient that satisfies a respective corresponding threshold.

Means for pre-calculating a plurality of sets of thresholds, wherein each of the threshold sets corresponds to a different value of the QP;
Means for selecting one of the plurality of threshold sets for use as the threshold matrix based on the value of the QP used to encode the pixel block; The apparatus of claim 10, further comprising:

A computer-readable recording medium having instructions stored thereon for processing digital video data that can be executed by a computer, the instructions comprising:
A code for identifying, without quantization, one or more transform coefficients for the residual data of a pixel block that are non-zero and remain non-zero when quantized;
A code for generating a matrix of transform coefficients, wherein the number of rows of the matrix of transform coefficients is equal to the number of rows of pixels in the block, and the number of columns of the matrix of transform coefficients is equal to the number of columns of pixels in the block;
A code for comparing the matrix of transform coefficients with a threshold matrix, wherein the threshold matrix has the same dimensions as the matrix of transform coefficients, and the comparison yields a matrix of 1s and 0s in which 0 represents a location in the matrix of transform coefficients that becomes zero after quantization and 1 represents a location in the matrix of transform coefficients that remains non-zero after quantization;
A code for adding the number of 1s in the matrix of 1s and 0s to calculate the number of transform coefficients identified as remaining non-zero when quantized;
A code for adding the absolute values of at least one of the transform coefficients in the matrix of transform coefficients corresponding to the locations of 1s in the matrix of 1s and 0s;
A code for estimating the number of bits associated with coding the residual data based on at least the number of non-zero transform coefficients and the sum of the absolute values of the at least one non-zero transform coefficient;
A code for estimating a coding cost for coding the pixel block based on the following equation:
J = D + λ_mode · (R_residual + R_other)
where J is the estimated coding cost, D is the distortion metric of the block, λ_mode is the Lagrange multiplier for each of the multiple coding modes, R_residual is the number of bits for coding the residual data using a context-adaptive coding technique, and R_other is the number of bits for coding other block data using a fixed length coding (FLC) technique or a universal variable length coding (VLC) technique.

The computer-readable recording medium of claim 13, wherein each of the thresholds of the threshold matrix is calculated as a function of a quantization parameter (QP).

The computer-readable recording medium of claim 14, wherein the locations in the matrix of transform coefficients that remain non-zero after quantization correspond to transform coefficients that satisfy respective corresponding thresholds.

The computer-readable recording medium of claim 14, further comprising:
A code for pre-calculating a plurality of sets of thresholds, each of the sets of thresholds corresponding to a different value of the QP; and
A code for selecting one of the plurality of threshold sets to be used as the threshold matrix based on the value of the QP used to encode the pixel block.

The method of claim 1, wherein estimating the number of bits associated with the coding of the residual data comprises estimating the number of bits needed to encode the residual data in each of at least two block modes, and estimating the coding cost comprises estimating the coding cost in each of the at least two block modes based on at least the number of bits estimated in the respective block mode, the method further comprising selecting one of the block modes based on at least the estimated coding cost for each of the modes.

For each of the modes, estimating a total coding cost for coding the pixel block using at least the estimated number of bits associated with coding of the residual data;
Selecting one of the plurality of modes having the least estimated total coding cost;
The method of claim 17, further comprising applying the selected mode to encode the pixel block.

The method of claim 18, wherein estimating the total coding cost comprises:
Calculating a distortion metric for the pixel block;
Calculating the number of bits associated with encoding the non-residual data of the pixel block; and
Estimating the total coding cost for encoding the pixel block based at least on the distortion metric, the number of bits associated with encoding the non-residual data, and the number of bits associated with encoding the residual data.
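The mode selection recited in claims 17 to 19 can be sketched in Python as follows. This is an illustration under stated assumptions: the mode names, the per-mode distortion and bit figures, and the use of a single Lagrange multiplier for every mode are hypothetical.

```python
def total_cost(distortion, r_residual, r_other, lambda_mode):
    """Total coding cost from distortion, residual bits and non-residual bits."""
    return distortion + lambda_mode * (r_residual + r_other)


def select_mode(candidates, lambda_mode):
    """Return (best_mode, costs) for the candidate block modes.

    `candidates` maps a mode name to (distortion, estimated residual bits,
    non-residual bits). The mode with the lowest estimated total coding
    cost is selected.
    """
    costs = {
        mode: total_cost(d, r_res, r_oth, lambda_mode)
        for mode, (d, r_res, r_oth) in candidates.items()
    }
    best_mode = min(costs, key=costs.get)
    return best_mode, costs


# Hypothetical candidates: (distortion, residual bits, non-residual bits).
candidates = {
    "intra_16x16": (300, 20, 6),
    "intra_4x4": (180, 45, 14),
    "inter_16x16": (150, 30, 10),
}
best_mode, costs = select_mode(candidates, lambda_mode=4)
```

With these figures the costs are 404, 416 and 310, so "inter_16x16" is selected and would then be applied to encode the pixel block.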

Selecting an encoding mode based on at least the estimated number of bits associated with encoding of the residual data;
After selecting the encoding mode, quantizing the transform coefficients for the residual data;
Encoding the quantized transform coefficients for the residual data;
Transmitting the encoded coefficients for the residual data;
The method of claim 1, further comprising:

The apparatus of claim 5, wherein:
The bit estimation module estimates a number of bits associated with encoding of the residual data in each of at least two block modes; and
The control module estimates a coding cost for each of the at least two block modes based at least on the estimated number of bits in the respective block mode, and selects one of the block modes based at least on the estimated coding cost for each of the modes.

The apparatus of claim 21, wherein the control module estimates, for each of the modes, a total coding cost for encoding the pixel block using at least the estimated number of bits associated with the encoding of the residual data, selects the one of the plurality of modes having the minimum estimated total coding cost, and applies the selected mode to encode the pixel block.

The apparatus of claim 22, wherein the control module calculates a distortion metric for the pixel block, calculates a number of bits associated with encoding the non-residual data of the pixel block, and estimates a total coding cost for encoding the pixel block based at least on the distortion metric, the number of bits associated with encoding the non-residual data, and the number of bits associated with encoding the residual data.

A control module that selects an encoding mode based at least on the estimated number of bits associated with encoding the residual data;
A quantization module that quantizes the transform coefficients for the residual data after selection of the encoding mode;
An entropy encoding module that encodes the quantized transform coefficients for the residual data;
A transmitter for transmitting the encoded coefficients for the residual data;
The apparatus of claim 5, further comprising:

The apparatus of claim 9, wherein the means for estimating the number of bits associated with the coding of the residual data estimates the number of bits associated with the coding of the residual data in each of at least two block modes, and the coding cost estimation means estimates a coding cost for each of the at least two block modes based on at least the estimated number of bits in the respective block mode, the apparatus further comprising means for selecting one of the block modes based on at least the estimated number of bits for each of the modes.

The apparatus according to claim 25, further comprising means for estimating, for each of the modes, a total coding cost for coding the pixel block using at least the estimated number of bits associated with encoding of the residual data, wherein the means for selecting one of the modes selects the one of the plurality of modes having the smallest estimated total coding cost.

The apparatus of claim 26, wherein the means for estimating the total coding cost calculates a distortion metric for the pixel block, calculates a number of bits associated with encoding of the non-residual data of the pixel block, and estimates a total coding cost for encoding the pixel block based on at least the distortion metric, the number of bits associated with encoding the non-residual data, and the number of bits associated with encoding the residual data.

Means for selecting a coding mode based on at least the estimated number of bits associated with coding of the residual data;
Means for quantizing the transform coefficients for the residual data after selecting the encoding mode;
Means for encoding the quantized transform coefficients for the residual data; and means for transmitting the encoded coefficients for the residual data;
The apparatus of claim 9, further comprising:

The computer-readable recording medium according to claim 13, wherein the code for estimating the number of bits associated with the encoding of the residual data comprises a code for estimating the number of bits associated with the encoding of the residual data in each of at least two block modes, and the code for estimating the coding cost comprises a code for estimating the coding cost for each of the at least two block modes based on at least the estimated number of bits in the respective block mode, the medium further comprising a code for selecting one of the block modes based on at least the estimated number of bits for each of the modes.

The computer-readable recording medium of claim 29, further comprising:
A code for estimating, for each of the modes, a total coding cost for encoding the pixel block using at least the estimated number of bits associated with encoding of the residual data;
A code for selecting the one of the plurality of modes having the minimum estimated total coding cost; and
A code for applying the selected mode to encode the pixel block.

The computer-readable recording medium of claim 30, wherein the code for estimating the total coding cost comprises:
A code for calculating a distortion metric for the pixel block;
A code for calculating the number of bits associated with encoding the non-residual data of the pixel block; and
A code for estimating the total coding cost for encoding the pixel block based on at least the distortion metric, the number of bits associated with the encoding of the non-residual data, and the number of bits associated with the encoding of the residual data.

The computer-readable recording medium of claim 13, further comprising:
A code for selecting an encoding mode based on at least the estimated number of bits associated with encoding of the residual data;
A code for quantizing the transform coefficients for the residual data after selecting the encoding mode;
A code for encoding the quantized transform coefficients for the residual data; and
A code for transmitting the encoded coefficients for the residual data.