JP2008503919A

JP2008503919A - Method and apparatus for optimizing video coding

Info

Publication number: JP2008503919A
Application number: JP2007516535A
Authority: JP
Inventors: マイケルトウラピス，アレグザンドロス; マクドナルドボイス，ジル; イン，ペング
Original assignee: Thomson Licensing SAS
Current assignee: Thomson Licensing SAS
Priority date: 2004-06-18
Filing date: 2005-06-06
Publication date: 2008-02-07
Also published as: WO2006007285A1; CN1969558A; EP1766991A1; MY149954A; BRPI0512057A

Abstract

An encoder for encoding video signal data corresponding to a plurality of pictures, and a corresponding method, are provided. The encoder includes an overlapping window analysis unit (310) that performs video analysis of the video signal data using a plurality of overlapping analysis windows over at least a portion of the plurality of pictures corresponding to the video signal data, and that adapts encoding parameters for the video signal data based on a result of the video analysis.

Description

(Cross-Reference to Related Applications)
This application claims the benefit of U.S. Provisional Patent Application No. 60/581,280, filed June 18, 2004.

The present invention relates generally to video encoders and video decoders and, more particularly, to a method and apparatus for optimizing video coding.

Multi-pass video coding schemes are used in many video coding architectures, such as MPEG-2 and JVT/H.264/MPEG AVC, to increase coding efficiency. The concept underlying these schemes is to encode the entire sequence on a trial basis over several iterations, performing analysis and collecting statistics that later iterations can use to improve coding performance.

Two-pass encoding schemes are already used in several encoding systems, such as the MICROSOFT (registered trademark) WINDOWS MEDIA (registered trademark) and REALVIDEO (registered trademark) encoders. Under such a scheme, the encoder first performs an initial encoding pass over the entire sequence using some initial default settings and collects statistics on the coding efficiency of each picture in the sequence. After this pass completes, the entire sequence is processed and encoded once more, this time taking the previously generated statistics into account. This can improve coding efficiency considerably and also makes it possible to satisfy certain predefined encoding constraints or requirements, such as a given bit rate constraint for the encoded stream. The improvement arises because the encoder now knows more about the characteristics of the video sequence or picture and can therefore make a better choice of parameters such as the quantizers and dead-zone processing used for encoding. Statistics that can be collected during the first encoding pass and used for this purpose include bits per picture, spatial activity (i.e., average normalized macroblock variance and mean), temporal activity (i.e., motion vectors and motion vector variance), and distortion (e.g., mean squared error (MSE)). Although these methods can improve encoding performance considerably, they tend to be very complex and can be used only offline (the entire sequence is encoded first, and the second pass is performed afterwards), which makes them unsuitable for real-time encoders. Nor do these methods necessarily take into account all of the statistics that could be inferred from the first encoding step.

(Summary of the Invention)
The present invention is directed to a method and apparatus for optimizing video coding and addresses the above-described and other drawbacks and disadvantages of the prior art.

According to one aspect of the present invention, an encoder for encoding video signal data corresponding to a plurality of pictures is provided. The encoder includes an overlapping window analysis unit that performs video analysis of the video signal data using a plurality of overlapping analysis windows over at least a portion of the plurality of pictures corresponding to the video signal data, and that adapts encoding parameters for the video signal data based on a result of the video analysis.

According to another aspect of the present invention, a method for encoding video signal data corresponding to a plurality of pictures is provided. The method includes the steps of performing video analysis of the video signal data using a plurality of overlapping analysis windows over at least a portion of the plurality of pictures corresponding to the video signal data, and adapting encoding parameters for the video signal data based on a result of the video analysis.

These and other aspects, features, and advantages of the present invention will become apparent from the following detailed description of exemplary embodiments, read in conjunction with the accompanying drawings.

The present invention may be better understood with reference to the exemplary drawings described below.

The present invention is directed to a method and apparatus for optimizing video coding. Advantageously, the invention allows a video encoder to compress a video sequence at a given bit rate with considerably improved subjective and objective quality. This is achieved by a special processing of the video sequence in which a simple analysis of the current picture is performed relative to the N subsequent pictures still to be encoded. The encoder can then use the result of this analysis to make better decisions about the encoding parameters used for the current picture (including, but not limited to, picture type/slice type, quantizers, thresholding parameters, and the Lagrangian λ). Unlike some prior art systems, which perform dual-pass or multi-pass encoding of the entire sequence to improve coding performance, the present invention is relatively simple and therefore has a relatively small impact on complexity. The principles of the invention can also be combined with other multi-pass encoding strategies to further increase efficiency. A more general system, which also uses M previously coded pictures, can be formed in the same way.

In accordance with the principles of the present invention, only a portion of the entire sequence, taken in overlapping windows of pictures, is analyzed first. Based on the statistics generated in this way, the encoding parameters of each picture are adjusted accordingly. These encoding parameters include, but are not limited to, picture type/slice type decision (I, P, B), frame/field decision, B-picture distance, picture or macroblock quantization values (QP), coefficient thresholding, Lagrangian parameters, chroma offsets, weighted prediction, reference picture selection, multiple block size decision, entropy parameter initialization, intra mode decision, and deblocking filter parameters. The picture/macroblock analysis can be performed with methods of differing complexity cost. These include, but are not limited to, full first-pass encoding, simple first-pass motion estimation combined with spatial analysis, and simple spatio-temporal analysis metrics such as variance and image difference. Furthermore, the overlapping picture windows (and the number of pictures in their overlap) can be made as large or as small as needed, which allows a range of trade-offs between delay and performance.

This description illustrates the principles of the present invention. It will be appreciated that those skilled in the art will be able to devise various arrangements that, although not explicitly described or shown herein, embody the principles of the invention and are included within its spirit and scope.

All examples and conditional language recited herein are intended for pedagogical purposes, to aid the reader in understanding the principles of the invention and the concepts contributed by the inventors to furthering the art, and are to be construed as being without limitation to such specifically recited examples and conditions.

Moreover, all statements herein reciting principles, aspects, and embodiments of the invention, as well as specific examples thereof, are intended to encompass both structural and functional equivalents thereof. Such equivalents include both currently known equivalents and equivalents developed in the future, i.e., any elements developed that perform the same function, regardless of structure.

Thus, for example, those skilled in the art will appreciate that the block diagrams presented herein represent conceptual views of illustrative circuitry embodying the principles of the invention. Similarly, it will be appreciated that any flow charts, flow diagrams, state transition diagrams, pseudocode, and the like represent various processes that may be substantially represented in computer-readable media and so executed by a computer or processor, whether or not such a computer or processor is explicitly shown.

The functions of the various elements shown in the figures may be provided through the use of dedicated hardware as well as hardware capable of executing software in association with appropriate software. When provided by a processor, the functions may be provided by a single dedicated processor, by a single shared processor, or by a plurality of individual processors, some of which may be shared. Moreover, explicit use of the term "processor" or "controller" should not be construed to refer exclusively to hardware capable of executing software, and may implicitly include, without limitation, digital signal processor (DSP) hardware, read-only memory (ROM) for storing software, random access memory (RAM), and non-volatile storage.

Other hardware, conventional and/or custom, may also be included. Similarly, any switches shown in the figures are conceptual only. Their function may be carried out through the operation of program logic, through dedicated logic, through the interaction of program control and dedicated logic, or even manually, the particular technique being selectable by the implementer as more specifically understood from the context.

In the claims hereof, any element expressed as a means for performing a specified function is intended to encompass any way of performing that function, including, for example, (a) a combination of circuit elements that performs that function or (b) software in any form, including firmware, microcode, or the like, combined with appropriate circuitry for executing that software to perform the function. The invention as defined by the claims resides in the fact that the functionalities provided by the various recited means are combined in the manner that the claims call for. Applicant therefore regards any means that can provide those functionalities as equivalent to those shown herein.

In accordance with the principles of the present invention, a new multi-pass encoding architecture is disclosed in which, unlike conventional methods that consider the entire video sequence or independent windows during each pass, each pass is performed on overlapping windows so that previously determined characteristics can be reused between adjacent windows. Because this architecture can reach an optimal encoding in far fewer steps, it retains the benefits of multi-pass encoding, such as significantly improved video quality, at lower cost and complexity and with smaller memory requirements and shorter latency. This is especially important for real-time encoding: because adjacent windows are similar, the encoder can determine the best parameters even during the first pass, so no further iterations are needed for the final encoding.

Referring to FIG. 1, a window-based two-pass encoding architecture is indicated generally by the reference numeral 100. The processing/analysis window spans W_p pictures, and the overlap allowed between two adjacent groups spans W_o pictures. Processing the first window yields initial statistics that can be used to determine a preliminary set of encoding characteristics for all frames in that window. More specifically, when a two-pass scheme is used, all frames that do not belong to the next window can be encoded immediately based on the generated parameters. This information can, however, also be used directly in processing/analyzing the next window. For example, the parameters can serve as initial seeds while that window is processed, which can improve the analysis given the high temporal correlation present in most sequences. More importantly, the encoding parameters used for the initial frames of this window, which by the choice of W_o also belong to the previous window, can be further refined/adjusted based on the newly generated statistics. In essence, this allows faster convergence to the optimal solution when many iterations/passes are used, for example after processing the entire sequence or M adjacent windows. The temporal window can clearly be made as large or as small as the capabilities or requirements of the encoder dictate, and the scheme can also be iterated with different window sizes (larger or smaller than W_o and W_p).
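The window layout described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `overlapping_windows` and the half-open index convention are assumptions, and the sketch only enumerates which pictures each window analyzes and which of them can be encoded before the next window is processed.

```python
def overlapping_windows(num_pictures, w_p, w_o):
    """Yield (start, end, encode_end) index triples, with end exclusive.

    Pictures [start, end) are analyzed together. Pictures
    [start, encode_end) do not belong to the next window and can
    therefore be encoded as soon as this window has been analyzed.
    """
    if not 0 <= w_o < w_p:
        raise ValueError("need 0 <= w_o < w_p")
    step = w_p - w_o  # each window advances by the non-overlapping part
    start = 0
    while start < num_pictures:
        end = min(start + w_p, num_pictures)
        last_window = end == num_pictures
        # The last window has no successor, so all of it can be encoded.
        encode_end = end if last_window else start + step
        yield start, end, encode_end
        if last_window:
            break
        start += step
```

With ten pictures, W_p = 4 and W_o = 2, the windows cover pictures 0-3, 2-5, 4-7 and 6-9, so pictures 2 and 3 are analyzed twice and their parameters can be refined by the second window before they are encoded.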

A number of different criteria can be used in the pre-analysis step of this multi-pass scheme. The choice can be driven by the complexity constraints of the encoder architecture and can range from simple spatio-temporal methods (including, but not limited to, edge detection, texture analysis metrics, and absolute image difference) to more complex strategies (including, but not limited to, discrete cosine transform (DCT) analysis, first-pass intra coding, motion estimation/motion compensation, and full encoding). Latency can also be adjusted by enlarging or shrinking the analysis windows and/or their overlap.

As an example of such a system, the following metrics can be computed during this analysis.

For every picture k in the window W_p, the following are computed:

(i) the mean value MBmean(k, i, j) of each macroblock at position (i, j);

(ii) the mean square value MBsqmean(k, i, j) of each macroblock;

(iii) the variance MBvariance(k, i, j) of each macroblock, given by

MBvariance(k, i, j) = MBsqmean(k, i, j) − (MBmean(k, i, j))^2

(iv) the average macroblock mean AMM_k over the entire picture;

(v) the average macroblock variance AMV_k;

(vi) the picture variance PV_k.

In the defining expressions, c[x, y] denotes the pixel value at position (x, y), PMB_W and PMB_H are the width and height of the picture in units of macroblocks, and B_W and B_H are the width and height of each macroblock in the current picture (usually B_W = B_H = 16).
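The spatial statistics listed above can be sketched in a few lines. The defining expressions are not available in this text, so the sketch assumes the conventional definitions suggested by the names: per-macroblock averages of c and c², picture-level averages of the macroblock values, and PV_k as the variance of all pixels in the picture. The function name `macroblock_stats` and the dictionary layout are illustrative only.

```python
def macroblock_stats(picture, b_w=16, b_h=16):
    """Compute assumed spatial statistics for one picture.

    picture is a list of rows of pixel values whose width and height
    are multiples of the macroblock size (b_w x b_h).
    """
    height, width = len(picture), len(picture[0])
    pmb_w, pmb_h = width // b_w, height // b_h  # picture size in macroblocks
    n = b_w * b_h
    mb_mean, mb_sqmean, mb_var = {}, {}, {}
    for j in range(pmb_h):
        for i in range(pmb_w):
            pixels = [picture[j * b_h + y][i * b_w + x]
                      for y in range(b_h) for x in range(b_w)]
            mean = sum(pixels) / n
            sqmean = sum(p * p for p in pixels) / n
            mb_mean[i, j] = mean
            mb_sqmean[i, j] = sqmean
            # Relation (iii): variance = mean of squares - square of mean.
            mb_var[i, j] = sqmean - mean * mean
    count = pmb_w * pmb_h
    amm = sum(mb_mean.values()) / count   # average macroblock mean
    amv = sum(mb_var.values()) / count    # average macroblock variance
    # Assumed: picture variance taken over all pixels of the picture.
    pv = sum(mb_sqmean.values()) / count - amm * amm
    return {"MBmean": mb_mean, "MBsqmean": mb_sqmean, "MBvariance": mb_var,
            "AMM": amm, "AMV": amv, "PV": pv}
```

AMV_k and PV_k answer different questions: a picture made of flat macroblocks with different brightness has AMV_k = 0 but PV_k > 0, which is why the decision rules test both.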

In addition, the following temporal characteristics are computed with respect to a picture m (for example, m = k + 1):

(I) the mean absolute picture difference MAPD_{k,m};

(II) the mean absolute weighted picture difference MAWPD_{k,m};

(III) the mean absolute offset picture difference MAOPD_{k,m};

(IV) the mean square picture error MSPE_{k,m};

(V) the absolute picture variance difference APVD_{k,m}, given by

APVD_{k,m} = |PV_k − PV_m|
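Three of these temporal measures can be sketched directly. The expressions for MAPD and MSPE are not available in this text, so the sketch assumes the usual per-pixel mean of absolute and squared differences; the weighted and offset variants are omitted because their weighting cannot be recovered here. The function name `temporal_stats` is illustrative.

```python
def temporal_stats(pic_k, pic_m, pv_k, pv_m):
    """Compare two equally sized pictures given as lists of pixel rows.

    pv_k and pv_m are the picture variances PV_k and PV_m computed
    beforehand by the spatial analysis.
    """
    diffs = [a - b
             for row_k, row_m in zip(pic_k, pic_m)
             for a, b in zip(row_k, row_m)]
    n = len(diffs)
    return {
        "MAPD": sum(abs(d) for d in diffs) / n,  # assumed definition
        "MSPE": sum(d * d for d in diffs) / n,   # assumed definition
        "APVD": abs(pv_k - pv_m),                # as given in the text
    }
```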

Other spatio-temporal characteristics that can be computed include the absolute difference of histograms, the histogram of absolute differences, a χ^2 metric between pictures k and m, the edges of picture k obtained with any one or more edge operators (including, but not limited to, the Canny, Sobel, or Prewitt edge operators), and field-based metrics for detecting the interlace characteristics of the sequence. Two further statistics that may be useful and can be inferred from the above are the distance of the current picture from the closest previously coded intra picture (last_idistance_k) and its distance from the closest future coded intra picture (next_idistance_k). These distances are measured, for example, in picture number, coding order, or picture order count (POC). These statistics can be improved by taking into account a scene change/shot detector and/or the default GOP (group of pictures) structure. The temporal characteristics can be computed from either the original images or the reconstructed images (for example, when the invention is applied in a multi-pass configuration), and the computation of these metrics can also take motion estimation/motion compensation into account.
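The two intra-distance statistics can be derived from a planned slice-type sequence, as in the sketch below. It assumes that distances are measured in picture-number units, that a picture is never its own nearest intra picture, and that `None` marks the absence of an intra picture on that side; none of these conventions is stated in the text, and the function name `intra_distances` is illustrative.

```python
def intra_distances(slice_types):
    """Return (last_idistance, next_idistance) lists for a sequence.

    slice_types is a list such as ["I", "P", "B", ...] in the order in
    which distances are to be measured.
    """
    n = len(slice_types)
    last = [None] * n
    nxt = [None] * n
    prev_intra = None
    for k in range(n):                  # forward scan: nearest past intra
        if prev_intra is not None:
            last[k] = k - prev_intra
        if slice_types[k] == "I":
            prev_intra = k
    next_intra = None
    for k in range(n - 1, -1, -1):      # backward scan: nearest future intra
        if next_intra is not None:
            nxt[k] = next_intra - k
        if slice_types[k] == "I":
            next_intra = k
    return last, nxt
```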

Based on the above metrics, the encoder can decide to modify parameters of a particular picture, macroblock, or sub-block that relate to the encoding process. These parameters include the quantization value (QP), coefficient dead-zone processing/thresholding, Lagrangian values for macroblock coding and for picture-level frame/field decisions, deblocking filter parameters, coding and reference picture ordering, scene/shot detection (including, but not limited to, fades, dissolves, wipes, and flashes), the GOP structure, and the like.

In one exemplary embodiment of the invention, the above metrics are used to adapt the picture quantization value (QP) when encoding a picture k of slice type cur_slice_type_k, as follows. In this embodiment, distance_{k,k+1} is the distance between two adjacent pictures, in units of pictures.

    if (next_idistance_k > 3 && cur_slice_type_k == I_Slice)
    {
        if (PV_k < 1 && MAPD_{k,k+1} < 1 && last_idistance_k > 5 * distance_{k,k+1})
            QP_k = QP_k - 4
        else if (MAPD_{k,k+1} < 3 && (k == 0 || last_idistance_k > 5 * distance_{k,k+1}))
            QP_k = QP_k - 3
        else if (MAPD_{k,k+1} < 10)
            QP_k = QP_k - 2
        else if (MAPD_{k,k+1} < 15)
            QP_k = QP_k - 1
    }
    else if (AMV_k > 10 && AMV_k < 60)
    {
        if (PV_k < 500 && next_idistance_k > 3 * distance_{k,k+1})
        {
            if (MAPD_{k,k+1} < 10 && AMV_k < 35 && last_idistance_k > 2 * distance_{k,k+1})
                QP_k = QP_k - 2
            else
                QP_k = QP_k - 1
        }
        else if (PV_k < 1500 && next_idistance_k > 0)
        {
            if (MAPD_{k,k+1} < 25)
                QP_k = QP_k - 1
        }
    }
    else if (MAPD_{k,k+1} == 0 && next_idistance_k > 3 * distance_{k,k+1} && last_idistance_k > 4 * distance_{k,k+1})
        QP_k = QP_k - 2
    else if (((MAPD_{k,k+1} < 2 && next_idistance_k > 3 * distance_{k,k+1} && last_idistance_k > 2 * distance_{k,k+1})
              || last_idistance_k > 30) && next_idistance_k > 5)
    {
        if (MAPD_{k,k+1} < 1)
            QP_k = QP_k - 3
        else if (MAPD_{k,k+1} < 4)
            QP_k = QP_k - 2
        else if (MAPD_{k,k+1} < 10)
            QP_k = QP_k - 1
    }
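The QP adaptation rules above translate almost line for line into a runnable function. The sketch below is one possible transcription: the function name `adapt_qp`, the string slice types, and the keyword defaults are assumptions, and the caller is expected to pass numeric distances (for example a large number when no intra picture exists on one side).

```python
def adapt_qp(qp, slice_type, pv, amv, mapd, last_idist, next_idist, k=1, dist=1):
    """Return the adapted QP for picture k.

    mapd is MAPD_{k,k+1}; dist is distance_{k,k+1} in pictures.
    Each early return mirrors one branch of the if / else-if chain.
    """
    if next_idist > 3 and slice_type == "I":
        if pv < 1 and mapd < 1 and last_idist > 5 * dist:
            return qp - 4
        if mapd < 3 and (k == 0 or last_idist > 5 * dist):
            return qp - 3
        if mapd < 10:
            return qp - 2
        if mapd < 15:
            return qp - 1
        return qp
    if 10 < amv < 60:
        if pv < 500 and next_idist > 3 * dist:
            if mapd < 10 and amv < 35 and last_idist > 2 * dist:
                return qp - 2
            return qp - 1
        if pv < 1500 and next_idist > 0 and mapd < 25:
            return qp - 1
        return qp
    if mapd == 0 and next_idist > 3 * dist and last_idist > 4 * dist:
        return qp - 2
    if ((mapd < 2 and next_idist > 3 * dist and last_idist > 2 * dist)
            or last_idist > 30) and next_idist > 5:
        if mapd < 1:
            return qp - 3
        if mapd < 4:
            return qp - 2
        if mapd < 10:
            return qp - 1
    return qp
```

All branches only lower QP, so the rules spend extra bits on pictures that are static, moderately textured, or far from the next intra picture, where the quality gain propagates to the pictures predicted from them.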

The above embodiment does not consider whether the immediately preceding picture, or another recent picture, has already had its QP updated by these rules. QP values may therefore be updated more often than necessary, which can be undesirable in terms of rate-distortion (RD) performance. To address this, the parameter last_idistance_k can be updated so that it is measured from the most recent picture whose QP was adjusted, regardless of that picture's type.

Similarly, macroblock/block variance, mean, and edge statistics can be used to determine local encoding parameters. For example, the following rules can be considered for selecting the Lagrangian λ of the macroblock at position (i, j). The λ expression assigned in each branch is not reproduced in this text.

    if (cur_slice_type_k != B_Slice)
    {
        if (contains_edges(k, i, j))
            [λ expression not reproduced]
        else if (cur_slice_type_k == I_Slice)
        {
            if (MBvariance(k, i, j) < 15 || MBvariance(k, i, j) > 60)
                [λ expression not reproduced]
            else if (MBvariance(k, i, j) >= 15 && MBvariance(k, i, j) <= 40)
                [λ expression not reproduced]
            else
                [λ expression not reproduced]
        }
        else // cur_slice_type_k == P_Slice
        {
            if (MBvariance(k, i, j) < 15 || MBvariance(k, i, j) > 60)
                [λ expression not reproduced]
            else if (MBvariance(k, i, j) >= 15 && MBvariance(k, i, j) <= 40)
                [λ expression not reproduced]
            else
                [λ expression not reproduced]
        }
    }
    else
    {
        bscale = max(2.00, min(4.00, (QP / 6.0)));
        if (contains_edges(k, i, j))
            [λ expression not reproduced]
        else
        {
            if (MBvariance(k, i, j) < 15 || MBvariance(k, i, j) > 60)
                [λ expression not reproduced]
            else if (MBvariance(k, i, j) >= 15 && MBvariance(k, i, j) <= 40)
                [λ expression not reproduced]
            else
                [λ expression not reproduced]
        }
        if (nal_reference_idc == 1)
            λ = 0.80 × λ
    }
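The branching structure of the λ rules can be sketched even though the per-branch expressions are missing. In the sketch below, only the variance thresholds (15, 40, 60), the bscale clamp, and the 0.80 factor come from the text. Modeling each branch as a multiplicative factor on a base λ, the `factors` table supplied by the caller, and all function and class names are assumptions made purely for illustration. How bscale enters the B-slice expressions cannot be recovered here, so it is exposed only as a helper.

```python
def variance_class(mb_variance):
    """Map MBvariance(k, i, j) to one of the three classes used by the rules."""
    if mb_variance < 15 or mb_variance > 60:
        return "low_or_high"
    if 15 <= mb_variance <= 40:
        return "mid"
    return "upper_mid"  # remaining case: 40 < variance <= 60


def b_slice_scale(qp):
    """bscale = max(2.00, min(4.00, QP / 6.0)), as given in the text."""
    return max(2.0, min(4.0, qp / 6.0))


def select_lambda(base_lambda, slice_type, has_edges, mb_variance,
                  nal_reference_idc, factors):
    """Pick a Lagrangian for one macroblock.

    factors[slice_type][cls] is a HYPOTHETICAL multiplier chosen by the
    caller; the actual expressions are not available in this text.
    """
    cls = "edges" if has_edges else variance_class(mb_variance)
    lam = base_lambda * factors[slice_type][cls]
    # Only the B-slice branch applies the extra 0.80 scaling.
    if slice_type == "B" and nal_reference_idc == 1:
        lam *= 0.80
    return lam
```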

Similar decisions can be made when selecting the quantization values or coefficient thresholding used for residual coding. Specifically, quantization of a coefficient W in H.264 is performed as follows:
Z = int({|W| + f × (1 << q_bits)} >> q_bits) · sgn(W)
Here, Z is the final quantized level, and q_bits depends on the quantizer QP of the current macroblock. The term f × (1 << q_bits) serves as the rounding term of the quantization process and is "optimal" when it equals 1/2 × (1 << q_bits), i.e., when f = 1/2.
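The effect of the rounding offset f can be seen with a small integer sketch of the quantization formula. The function name `quantize` is illustrative, and truncating f × (1 << q_bits) to an integer is an assumption about how the offset is represented.

```python
def quantize(w, f, q_bits):
    """Quantize coefficient w as Z = ((|W| + f * 2^q_bits) >> q_bits) * sgn(W)."""
    offset = int(f * (1 << q_bits))      # rounding term, truncated to an integer
    level = (abs(w) + offset) >> q_bits  # integer division by 2^q_bits
    return level if w >= 0 else -level
```

For W = 24 and q_bits = 4 (a step of 16), f = 1/2 rounds to the nearest level and gives 2, whereas f = 1/6 gives 1: a smaller f widens the interval mapped to the lower level, which is the dead-zone effect discussed with FIG. 2.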

Referring now to FIG. 2, the effect of dead-zone processing during transform and quantization is indicated generally by the reference numeral 200. In FIG. 2, the interval around zero is called the dead zone. As shown in FIG. 2, a dead-zone quantizer is characterized by two parameters, the zero-bin width (2s − 2f) and the out-bin width (s). Optimizing the dead zone through f is often used as an efficient way to obtain good rate-distortion performance. It is also well known that introducing a dead zone during this process (i.e., reducing the f term) usually lowers the bit rate further while having little impact on quality. This is especially true for low-resolution content that lacks the fine detail (and film grain information) of high-resolution material. Although f = 1/2 could be used, this value can increase the bit rate considerably and may degrade performance in rate-distortion terms.

Given that some frequencies are more important than others, an alternative approach takes this observation into account to improve performance. Instead of using a constant value of f for all transform coefficients, different values are considered, in a matrix approach in which each dead-zone parameter is selected according to frequency position. In this case, Z can be computed as follows:
Z = int({|W| + f(i, j) × (1 << q_bits)} >> q_bits) · sgn(W)
Here, i and j correspond to the current column and row within the block of transform coefficients. The array f can depend on the slice or macroblock type and on the texture characteristics (variance or edge information) of the current block. For example, if a block contains edges or has low variance, artifacts from dead-zone processing become more visible, so it is important not to introduce more of them. Blocks with high spatial activity, on the other hand, can mask more artifacts, so the dead zone can be enlarged without significantly affecting quality. The dead zone can also be varied depending on whether the current block provides useful information for blocks of future pictures (i.e., whether any pixel in the current block is used to predict other pixels).
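As an illustration of the formula above, the following sketch applies a per-position offset f(i, j) to a small block of coefficients using integer shifts. The offset values, the choice q_bits = 4, the 2×2 block size, and the function name are all hypothetical; W is assumed to hold integer coefficients that have already been scaled.

```python
def quantize_block(W, f, q_bits):
    """Apply Z = int((|W| + f(i,j) * (1 << q_bits)) >> q_bits) * sgn(W) per coefficient."""
    Z = []
    for i, row in enumerate(W):
        out_row = []
        for j, w in enumerate(row):
            # Per-frequency rounding offset, converted to the fixed-point scale.
            offset = int(f[i][j] * (1 << q_bits))
            level = (abs(w) + offset) >> q_bits
            out_row.append(level if w >= 0 else -level)
        Z.append(out_row)
    return Z

# Hypothetical values: a larger offset in the first column, a smaller one in the second.
W = [[40, -40],
     [7, 100]]
f = [[0.5, 0.25],
     [0.5, 0.25]]
print(quantize_block(W, f, q_bits=4))
```

The same function works unchanged for a 4×4 block; only the shapes of W and f change.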

For example, when a 4×4 transform is used, the following dead-zone matrices can be used.
if (cur_slice_type_k == I_Slice)
{
    if (MBvariance(k, i, j) < 15 || MBvariance(k, i, j) > 60)
        [4×4 dead-zone matrix; values missing in source]
    else if (MBvariance(k, i, j) >= 15 && MBvariance(k, i, j) <= 40 || contains_edges(k, i, j))
        [4×4 dead-zone matrix; values missing in source]
    else
        [4×4 dead-zone matrix; values missing in source]
}
else if (cur_slice_type_k == P_Slice)
{
    if (MBvariance(k, i, j) < 15 || MBvariance(k, i, j) > 60)
        [4×4 dead-zone matrix; values missing in source]
    else
        [4×4 dead-zone matrix; values missing in source]
}
else // B_slices
{
    [4×4 dead-zone matrix; values missing in source]
}
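The branching above can be restated as a small runnable selector. Since the matrix values are not available here, this sketch returns a descriptive label rather than a matrix; the label names, the function name, and passing the variance and edge flag as plain arguments are assumptions made for illustration.

```python
def select_deadzone_case(slice_type, mb_variance, contains_edges):
    """Return a label for the dead-zone matrix that the pseudocode would pick.

    The labels are hypothetical; a caller would map each one to its own
    4x4 matrix f.
    """
    if slice_type == "I":
        if mb_variance < 15 or mb_variance > 60:
            return "I_low_or_high_variance"
        # Same precedence as the C-style original: (a && b) || c
        elif (15 <= mb_variance <= 40) or contains_edges:
            return "I_mid_variance_or_edges"
        else:
            return "I_other"
    elif slice_type == "P":
        if mb_variance < 15 or mb_variance > 60:
            return "P_low_or_high_variance"
        else:
            return "P_other"
    else:
        # B slices use a single matrix regardless of block statistics.
        return "B"

print(select_deadzone_case("I", 50, False))
```

Note that the variance test comes first, so an I-slice block with variance above 60 takes the first branch even if it contains edges.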

Under certain conditions, the encoder may not be able to perform temporal analysis using future frames. In this case, temporal analysis can be performed by considering only previously encoded pictures and assuming that future pictures have similar temporal characteristics. For example, if the current picture is highly similar to the previous one (e.g., MAPD_{k,k−1} is small), the difference from the next picture to be encoded (MAPD_{k,k+1}) is assumed to be small as well. All indices (k, k+1) can therefore be replaced with (k, k−1), and the encoding parameters can be adapted based on information that is already available.
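The substitution of (k, k−1) for (k, k+1) can be sketched as follows. The sketch assumes that MAPD is the mean absolute difference between co-located pixels of two pictures and represents pictures as lists of rows; `mapd`, `temporal_measure`, and the `lookahead` flag are hypothetical names.

```python
def mapd(a, b):
    """Mean absolute difference between two equally sized pictures (assumed definition)."""
    diffs = [abs(x - y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b)]
    return sum(diffs) / len(diffs)

def temporal_measure(frames, k, lookahead):
    """Return MAPD_{k,k+1} if frame k+1 may be used, else MAPD_{k,k-1} as a proxy."""
    if lookahead and k + 1 < len(frames):
        return mapd(frames[k], frames[k + 1])
    if k == 0:
        raise ValueError("no previously coded picture to compare against")
    return mapd(frames[k], frames[k - 1])

frames = [
    [[0, 0], [0, 0]],
    [[2, 2], [2, 2]],
    [[2, 2], [2, 6]],
]
print(temporal_measure(frames, 1, lookahead=True))   # compares frames 1 and 2
print(temporal_measure(frames, 1, lookahead=False))  # compares frames 1 and 0
```

The proxy is only as good as the assumption that temporal characteristics change slowly; a scene change between k and k+1 would invalidate it.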

Referring now to FIG. 3, a video encoder is indicated generally by the reference numeral 300. An input of the video encoder 300 is connected in signal communication with an input of a pre-analysis block 310. The pre-analysis block 310 includes a plurality of frame delays 312 connected in signal communication with one another. The frame delays 312 are connected in series and are also connected in parallel through a parallel signal path, which is in turn connected in signal communication with an input of a temporal analyzer 315.

The output of the last frame delay 312 in the series, the one farthest from the input of the encoder 300, is connected in signal communication with an input of a spatial analyzer 320, an inverting input of a first summing junction 325, a first input of a motion compensator 375, and a first input of a motion estimation/mode decision block 370. The output of the first summing junction 325 is connected in signal communication with an input of a transformer 330. The output of the transformer 330 is connected in signal communication with a first input of a quantizer 335. The output of the quantizer 335 is connected in signal communication with a first input of a variable length coder 340 and an input of an inverse quantizer 345. The output of the variable length coder 340 is an externally available output of the video encoder 300. The output of the inverse quantizer 345 is connected in signal communication with an input of an inverse transformer 350. The output of the inverse transformer 350 is connected in signal communication with a first non-inverting input of a second summing junction 355. The output of the second summing junction 355 is connected in signal communication with a first input of a loop filter 360. The output of the loop filter 360 is connected in signal communication with a first input of a picture reference store 365. The output of the picture reference store 365 is connected in signal communication with a second input of the motion estimation/mode decision block 370 and a second input of the motion compensator 375. A first output of the motion estimation/mode decision block 370 is connected in signal communication with a second input of the variable length coder 340. A second output of the motion estimation/mode decision block 370 is connected in signal communication with a third input of the motion compensator 375. The output of the motion compensator 375 is connected in signal communication with a non-inverting input of the first summing junction 325 and a second non-inverting input of the second summing junction 355.

A first output of the spatial analyzer 320 is connected in signal communication with a second input of the quantizer 335. A second output of the spatial analyzer 320 is connected in signal communication with a second input of the loop filter 360, a third input of the motion estimation/mode decision block 370, and the non-inverting input of the first summing junction 325. A first output of the temporal analyzer 315 is connected in signal communication with the second input of the quantizer 335. A second output of the temporal analyzer 315 is connected in signal communication with a fourth input of the motion estimation/mode decision block 370. A third output of the temporal analyzer 315 is connected in signal communication with a third input of the loop filter 360 and a second input of the picture reference store 365.
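The connections listed above can also be written down as a directed graph, which makes the signal paths easier to trace. The following sketch transcribes the connections from the last frame delay onward into a dictionary and adds a small reachability helper; the node names and the `reachable` function are illustrative, and the input numbering (first, second, and so on) is not modeled.

```python
# Each key drives the blocks in its list (names are illustrative labels).
CONNECTIONS = {
    "last_frame_delay_312": ["spatial_analyzer_320", "summing_junction_325",
                             "motion_compensator_375", "me_mode_decision_370"],
    "summing_junction_325": ["transformer_330"],
    "transformer_330": ["quantizer_335"],
    "quantizer_335": ["vlc_340", "inverse_quantizer_345"],
    "inverse_quantizer_345": ["inverse_transformer_350"],
    "inverse_transformer_350": ["summing_junction_355"],
    "summing_junction_355": ["loop_filter_360"],
    "loop_filter_360": ["picture_reference_store_365"],
    "picture_reference_store_365": ["me_mode_decision_370", "motion_compensator_375"],
    "me_mode_decision_370": ["vlc_340", "motion_compensator_375"],
    "motion_compensator_375": ["summing_junction_325", "summing_junction_355"],
    "spatial_analyzer_320": ["quantizer_335", "loop_filter_360",
                             "me_mode_decision_370", "summing_junction_325"],
    "temporal_analyzer_315": ["quantizer_335", "me_mode_decision_370",
                              "loop_filter_360", "picture_reference_store_365"],
}

def reachable(start):
    """Return every block that a signal leaving `start` can eventually reach."""
    seen = set()
    stack = list(CONNECTIONS.get(start, []))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(CONNECTIONS.get(node, []))
    return seen

# The quantizer output feeds back to the quantizer input through the
# reconstruction and prediction loop.
print("quantizer_335" in reachable("quantizer_335"))
```

Tracing the graph this way shows, for example, that both analyzers influence the quantizer, the loop filter, and the motion estimation/mode decision block.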

A group of pictures is considered during the temporal analysis step. This step determines several parameters, including slice type decisions, GOP structure, weighting parameters (through the motion estimation/mode decision block 370), quantization values and dead-zone processing (through the quantizer 335), reference order and handling (picture reference store 365), picture coding order, frame/field picture-level adaptive decisions, and deblocking parameters (loop filter 360). Similarly, spatial analysis is performed for each coded frame. It can likewise affect quantization and dead-zone processing (quantizer 335), Lagrangian parameters and slice type decisions (motion estimation/mode decision block 370), inter/intra mode decisions, frame/field picture-level and macroblock-level adaptive decisions, and deblocking (loop filter 360).

Referring now to FIG. 4, an exemplary process for encoding video signal data is indicated generally by the reference numeral 400. In this process, the same bitstream can be analyzed or encoded multiple times, with the necessary statistics collected and updated at each iteration. In each subsequent pass, these statistics are used to improve coding performance by adapting the encoder parameters to the video characteristics or user requirements. Specifically, k frames (i.e., excluding pictures that are not stored) are encoded in L passes (also referred to herein as "repetitions" or "iterations") using windows of size (N, M), where N is the total number of frames in a window and M is the number of frames that overlap between adjacent windows. The frame being encoded is indexed by the variable frm, and the current position within the window is indexed by the variable w_index.

The process includes a start block 405 that passes control to a function block 410. Function block 410 sets the sequence size to k, sets the number of repetitions to L, sets the variable i to zero (0), and passes control to a function block 415. Function block 415 sets the window size to N, sets the overlap size to M, sets the variable frm to zero (0), and passes control to a function block 420. Function block 420 sets the variable w_index to zero (0) and passes control to a function block 425. It will thus be appreciated that the window parameters are initialized for each encoding pass. This also makes it possible to use different window sizes, or to adapt the window size based on previous analysis steps (e.g., if a scene change is detected, N and M can be adjusted so that a window contains only a complete scene).

Function block 425 performs temporal analysis of each window to be processed, considering all N frames in the window, generates temporal statistics (tstat_{i, frm…frm+N−1}), optimally adapts or refines the statistics from previous passes or encoding steps using the current statistics, and passes control to a function block 430. Function block 430 performs spatial analysis of the frame with index frm (position w_index in the current window), for as long as the condition w_index < N − M holds, and passes control to a function block 435. Function block 435 encodes these frames based on the results of the temporal and spatial analyses, generates and collects encoder statistics that may be needed if multiple passes are required, and passes control to a function block 440.

Function block 440 increments the values of the variables frm and w_index and passes control to a decision block 445. Decision block 445 determines whether the variable frm is less than k.

If frm is less than k, control passes to a decision block 450, which determines whether w_index is less than (N − M). Otherwise, if frm is greater than or equal to k, control passes to a decision block 455, which determines whether i is less than L.

If w_index is less than (N − M), control passes to function block 430. Otherwise, if w_index is greater than or equal to (N − M), control returns to function block 420.

If i is less than L, control returns to function block 415. Otherwise, if i is greater than or equal to L, control passes to an end block 460.
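The flow of FIG. 4 can be sketched as a nested loop. The following is a minimal sketch under stated assumptions: the pass counter i is incremented once per pass and the process repeats while i is less than L (the text does not say where i is incremented), and a window that would run past the last frame is clipped to the sequence. The function name and the returned schedule format are illustrative, and the analysis and encoding steps are reduced to recording frame indices.

```python
def window_schedule(k, L, N, M):
    """List, for each pass, the (analysis_window, encoded_frames) pairs of FIG. 4."""
    if not 0 <= M < N:
        raise ValueError("overlap M must satisfy 0 <= M < N")
    passes = []
    for i in range(L):                      # blocks 410/455: pass counter (increment assumed)
        frm = 0                             # block 415
        windows = []
        while frm < k:                      # decision block 445
            w_index = 0                     # block 420
            # Block 425: temporal analysis looks at N frames starting at frm
            # (clipped at the end of the sequence, an assumption of this sketch).
            analysis_window = list(range(frm, min(frm + N, k)))
            encoded = []
            while frm < k and w_index < N - M:   # decision blocks 445 and 450
                encoded.append(frm)         # blocks 430/435: spatial analysis and encoding
                frm += 1                    # block 440
                w_index += 1
            windows.append((analysis_window, encoded))
        passes.append(windows)
    return passes

for window, encoded in window_schedule(k=7, L=1, N=4, M=2)[0]:
    print("analyze", window, "encode", encoded)
```

Each window analyzes N frames but only the first N − M of them are encoded before the next window starts, which is what makes consecutive analysis windows overlap by M frames.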

A description will now be given of some of the many attendant advantages/features of the present invention, according to various exemplary embodiments. For example, one advantage/feature is an encoding apparatus and method that perform video analysis based on limited, overlapping windows of the content to be encoded and use this information to adapt the encoding parameters. Another advantage/feature is the use of spatio-temporal analysis in the video analysis. Yet another advantage/feature is that a pre-encoding pass is considered in the video analysis. Still another advantage/feature is that both spatio-temporal analysis and a pre-encoding pass are considered in the video analysis. Also, another advantage/feature is that at least one of picture coding type, edge, mean, and variance information is used for the spatial analysis and for the adaptation of Lagrangian parameters, quantization, and dead-zone processing. Yet another advantage/feature is that absolute difference and variance are used to adapt the quantization parameters. Further, another advantage/feature is that only previously encoded pictures are considered in the video analysis performed. Moreover, another advantage/feature is that the video analysis performed is used to determine at least one of several encoding parameters, including but not limited to slice type decision, GOP and picture coding structure and order, weighting parameters, quantization values and dead-zone processing, Lagrangian parameters, number of references, reference order and handling, frame/field picture and macroblock decisions, deblocking parameters, inter block size decision, intra spatial prediction, and direct modes. Also, another advantage/feature is that the video analysis can be performed in multiple iterations while taking previously generated statistics into account to adapt the encoding parameters or the analysis statistics. Further, another advantage/feature is that the window size and the overlapping window regions are adaptable based on previously generated analysis statistics.

These and other features and advantages of the present invention may be readily ascertained by one of ordinary skill in the pertinent art based on the teachings herein. It is to be understood that the teachings of the present invention may be implemented in various forms of hardware, software, firmware, special-purpose processors, or combinations thereof.

Most preferably, the teachings of the present invention are implemented as a combination of hardware and software. Moreover, the software is preferably implemented as an application program tangibly embodied on a program storage device. The application program may be uploaded to, and executed by, a machine comprising any suitable architecture. Preferably, the machine is implemented on a computer platform having hardware such as one or more central processing units (“CPU”), a random access memory (“RAM”), and input/output (“I/O”) interfaces. The computer platform may also include an operating system and microinstruction code. The various processes and functions described herein may be part of the microinstruction code, part of the application program, or any combination thereof, and may be executed by a CPU. In addition, various other peripheral devices, such as an additional data storage device and a printing device, may be connected to the computer platform.

It is to be further understood that, because some of the constituent system components and methods depicted in the accompanying drawings are preferably implemented in software, the actual connections between the system components or the process function blocks may differ depending on the manner in which the present invention is implemented. Given the teachings herein, one of ordinary skill in the pertinent art will be able to contemplate these and similar implementations or configurations of the present invention.

Although the illustrative embodiments have been described herein with reference to the accompanying drawings, it is to be understood that the present invention is not limited to those precise embodiments, and that various changes and modifications may be effected therein by one of ordinary skill in the pertinent art without departing from the scope or spirit of the present invention. All such changes and modifications are intended to be included within the scope of the present invention as set forth in the appended claims.

FIG. 1 is a block diagram illustrating an exemplary window-based two-pass encoding architecture in accordance with the principles of the present invention.
FIG. 2 is a plot of the effect of dead-zone processing during transformation and quantization in accordance with the principles of the present invention.
FIG. 3 is a block diagram illustrating an encoder in accordance with the principles of the present invention.
FIG. 4 is a flow diagram illustrating an exemplary encoding process in accordance with the principles of the present invention.

Claims

1. An encoder for encoding video signal data corresponding to a plurality of pictures, the encoder comprising an overlapping window analysis unit (310) that performs video analysis of the video signal data using a plurality of overlapping analysis windows for at least some of the plurality of pictures corresponding to the video signal data, and that adapts encoding parameters for the video signal data based on a result of the video analysis.

2. The encoder of claim 1, wherein the overlapping window analysis unit (310) performs the video analysis of the video signal data using spatio-temporal analysis.

3. The encoder of claim 2, wherein the overlapping window analysis unit (310) uses at least one of picture coding type information, edge information, mean information, and variance information for at least one of the spatio-temporal analysis, adaptation of Lagrangian parameters and quantization parameters, and dead-zone processing.

4. The encoder of claim 3, wherein the overlapping window analysis unit (310) adapts the quantization parameters using absolute difference and variance.

5. The encoder of claim 1, wherein the overlapping window analysis unit (310) performs the video analysis of the video signal data using a pre-encoding pass.

6. The encoder of claim 1, wherein the overlapping window analysis unit (310) performs the video analysis of the video signal data using both spatio-temporal analysis and a pre-encoding pass.

7. The encoder of claim 6, wherein the overlapping window analysis unit (310) uses at least one of picture coding type information, edge information, mean information, and variance information for at least one of the spatio-temporal analysis, adaptation of Lagrangian parameters and quantization parameters, and dead-zone processing.

8. The encoder of claim 7, wherein the overlapping window analysis unit (310) adapts the quantization parameters using absolute difference and variance.

9. The encoder of claim 1, wherein the video signal data includes a plurality of frames, each of the plurality of frames representing a corresponding picture, and the overlapping window analysis unit performs the video analysis so as to consider only previously encoded pictures.

10. The encoder of claim 1, wherein the encoding parameters include at least one of slice type, picture and GOP coding structure and order, weighting parameters, quantization values and dead-zone processing, Lagrangian parameters, number of references, reference order and handling, frame/field picture and macroblock parameters, deblocking parameters, inter block size, intra spatial prediction, and direct mode.

11. The encoder of claim 1, wherein the overlapping window analysis unit (310) performs the video analysis in a plurality of iterations and adapts one of the encoding parameters and analysis statistics based on previously generated analysis statistics.

12. The encoder of claim 1, wherein each of the overlapping windows has a window size of P pictures and an associated overlap size, and the overlapping window analysis unit adapts the window size and the overlap size based on previously generated analysis statistics.

13. A method for encoding video signal data corresponding to a plurality of pictures, comprising:
performing (425, 430) video analysis of the video signal data using a plurality of overlapping analysis windows for at least some of the plurality of pictures corresponding to the video signal data; and
adapting (435) encoding parameters for the video signal data based on a result of the video analysis.

14. The method of claim 13, wherein the performing step performs the video analysis of the video signal data using spatio-temporal analysis.

15. The method of claim 14, wherein the performing step and the adapting step respectively use at least one of picture coding type information, edge information, mean information, and variance information for at least one of the spatio-temporal analysis, adaptation of Lagrangian parameters and quantization parameters, and dead-zone processing.

16. The method of claim 15, wherein the quantization parameters are adapted using absolute difference and variance.

17. The method of claim 13, wherein the performing step performs the video analysis of the video signal data using a pre-encoding pass.

18. The method of claim 13, wherein the performing step performs the video analysis of the video signal data using both spatio-temporal analysis and a pre-encoding pass.

19. The method of claim 18, wherein the performing step and the adapting step respectively use at least one of picture coding type information, edge information, mean information, and variance information for at least one of the spatio-temporal analysis, adaptation of Lagrangian parameters and quantization parameters, and dead-zone processing.

20. The method of claim 19, wherein the quantization parameters are adapted using absolute difference and variance.

21. The method of claim 13, wherein the video signal data includes a plurality of frames, each of the plurality of frames representing a corresponding picture, and the performing step performs the video analysis so as to consider only previously encoded pictures.

22. The method of claim 13, wherein the encoding parameters include at least one of slice type, picture and GOP coding structure and order, weighting parameters, quantization values and dead-zone processing, Lagrangian parameters, number of references, reference order and handling, frame/field picture and macroblock parameters, deblocking parameters, inter block size, intra spatial prediction, and direct mode.

23. The method of claim 13, wherein the performing step performs the video analysis in a plurality of iterations, and the adapting step adapts one of the encoding parameters and analysis statistics based on previously generated analysis statistics.

24. The method of claim 13, wherein each of the overlapping windows has a window size and an associated overlap size, and the performing step includes adapting the window size and the overlap size based on previously generated analysis statistics.