JP5049386B2

JP5049386B2 - Moving picture encoding apparatus and moving picture decoding apparatus

Info

Publication number: JP5049386B2
Application number: JP2010509122A
Authority: JP
Inventors: 友子青野; 典男伊藤
Original assignee: Sharp Corp
Current assignee: Sharp Corp
Priority date: 2008-04-21
Filing date: 2009-03-27
Publication date: 2012-10-17
Anticipated expiration: 2029-03-27
Also published as: WO2009130971A1; JPWO2009130971A1

Description

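[Technical Field]
[0001]
The present invention relates to a moving picture encoding apparatus and a moving picture decoding apparatus capable of encoding and decoding processing suited to the performance of the apparatus.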
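[Background Art]
[0002]
MPEG-4 AVC/H.264 (hereinafter abbreviated as H.264), the latest international standard for moving picture coding, achieves better prediction performance than earlier coding techniques through improvements to intra prediction and inter prediction.
[0003]
For example, intra prediction was conventionally performed on the transform coefficients after the DCT (Discrete Cosine Transform), and there were only three prediction methods: DC prediction and AC prediction in the horizontal and vertical directions.
In H.264, however, prediction is performed on pixel values before the DCT, and the optimum mode can be selected from many prediction modes (nine for 4×4 blocks, nine for 8×8 blocks, and four for 16×16 blocks), so prediction accuracy is higher.
[0004]
In inter prediction, motion compensation was conventionally limited to the 16×16 and 8×8 block sizes.
In H.264, however, the number of block sizes has increased to seven (16×16, 16×8, 8×16, 8×8, 8×4, 4×8, and 4×4), which raises prediction accuracy.
[0005]
Because prediction accuracy has improved in this way, the prediction error is smaller and fewer bits are needed for encoding even at the same picture quality as before. On the other hand, when the quantization step is coarse, distortion of a kind not seen before now appears.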
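[0006]
For example, whether the whole image moves because of camera work such as panning or only part of the image moves, inter prediction conventionally outperformed intra prediction, and inter prediction was selected for almost all moving regions.
In H.264, however, prediction accuracy within a picture has improved, so intra prediction is now selected in flat regions with little texture even when they are moving regions.
[0007]
When blocks coded with intra prediction and blocks coded with inter prediction are mixed in this way, the fineness of the texture differs between them, which degrades the visual impression. In addition, when intra prediction is selected instead of inter prediction, the texture of the image is difficult to reproduce.
[0008]
Furthermore, when content is encoded, key frames (or fields; hereinafter both are referred to as frames) have conventionally been inserted periodically for random access and fast playback.
In H.264, an IDR (Instantaneous Decoder Refresh) picture is used as this key frame, and a closed GOP (Group Of Pictures) structure is often adopted, which prohibits pictures decoded after the IDR picture from referring to pictures decoded before it.
[0009]
In this case, in still scenes with highly complex texture or in slow panning scenes, noise causes the strength of the in-loop filter and the selected intra prediction mode to differ at each IDR picture. This appears as a distortion perceived by the user: flicker at the IDR picture insertion interval (GOP).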
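[0010]
To solve the above problems, Non-Patent Document 1 proposes, for scenes in which the entire screen moves because of camera work, a method of selecting the intra/inter prediction mode for P and B pictures that takes into account the motion of the entire screen and the proportion of intra-predicted blocks near the block to be encoded.
[0011]
Patent Document 1 proposes, when GOP-unit flicker is judged likely to be noticed, encoding with an open GOP instead of a closed GOP, or changing the intra/inter prediction mode selection method, so that abrupt temporal changes in the image are not perceived as distortion (flicker).
[0012]
Patent Document 2 proposes suppressing the distortion (flicker) by having the decoding apparatus adjust the temporal and spatial filtering strength based on information in the encoded data, blurring the texture of parts judged prone to flicker.
[Non-Patent Document 1]
Yoshino (吉野知伸) and two others, "A Study on Improvement of Subjective Image Quality in H.264/MPEG-4 AVC Coding", Proceedings of the 2006 Winter Annual Convention of the Institute of Image Information and Television Engineers, Institute of Image Information and Television Engineers, December 2006, No. 1-1
[Patent Document 1]
JP 2007-214785 A
[Patent Document 2]
JP 2006-229411 A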
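[Disclosure of the Invention]
[Problems to Be Solved by the Invention]
[0013]
However, Non-Patent Document 1 has the following problems.
・The motion of the entire screen must be newly calculated, which increases the amount of processing.
・A block for which the cost calculation originally judged intra prediction to give the smaller code amount is changed to inter prediction, so the code amount can increase.
・Because it targets scenes in which the entire screen moves, it cannot be applied to scenes in which only part of the screen is a moving region.
[0014]
Patent Document 1 has the following problems.
・Changing to an open GOP produces pictures that cannot be decoded normally, so additional processing is needed, such as not displaying those pictures.
・It cannot be applied to applications that do not support open GOPs, such as One-Seg.
[0015]
In Patent Document 2, because the filter strength is changed locally, a wrong decision can make flicker more noticeable instead, and the filter strength is difficult to tune.
[0016]
The present invention has been made in view of the above circumstances, and its object is to provide a moving picture encoding apparatus and a moving picture decoding apparatus that can add processing to suppress coding distortion in moving regions and to improve the quality of the reproduced image while remaining compatible with the MPEG-4 AVC/H.264 standard.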
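[Means for Solving the Problems]
[0017]
To solve the above problems, the present invention is configured as follows.
A moving picture encoding apparatus divides a moving picture into small blocks, selects either an intra prediction mode or an inter prediction mode as the prediction mode for each block, and outputs encoded data. The apparatus comprises: a motion amount calculation unit that calculates, based on the shooting parameters of a camera, the motion amount of the image corresponding to the motion of the camera (equivalent to a motion vector); a comparison unit that compares, for each block, the motion vector used in the inter prediction mode with the motion amount; and an additional data encoding unit that, according to the comparison result, encodes the motion amount and the position information of the relevant blocks and stores and appends them separately from the encoded data. Compared with decoding the encoded data alone, adding the decoding of the additional data improves the quality of the reproduced image.
[0018]
Corresponding to the above moving picture encoding apparatus, a moving picture decoding apparatus decodes encoded data that was encoded by dividing a moving picture into small blocks and selecting either an intra prediction mode or an inter prediction mode as the prediction mode for each block. It also acquires and decodes additional data that was stored and appended separately from the encoded data by encoding the motion amount and the position information of the relevant blocks according to the result of comparing, for each block, the motion vector used in the inter prediction mode with the motion amount of the image corresponding to the motion of the camera. The moving picture decoding apparatus comprises: an additional data decoding unit that decodes the additional data stored and appended separately from the encoded data to acquire the motion amount and the block position information; a motion compensation unit that generates a motion-compensated block from the motion amount; and a synthesis unit that combines the image of the decoded block of the encoded data corresponding to the block indicated by the position information with the image of the motion-compensated block generated by the motion compensation unit.
[0019]
As another configuration, a moving picture encoding apparatus divides a moving picture into small blocks, selects either an intra prediction mode or an inter prediction mode as the prediction mode for each block, and outputs encoded data. The apparatus comprises: a motion amount calculation unit that, for a key frame, calculates the motion amount of the image corresponding to the motion of the camera based on the shooting parameters of the camera; a comparison unit that, when the motion amount is at or below a predetermined value, compares for each block the motion vector used in the inter prediction mode with the motion amount; and an additional data encoding unit that, according to the comparison result, encodes the motion amount and the position information of the relevant blocks and stores and appends them separately from the encoded data.
[0020]
Corresponding to the moving picture encoding apparatus of this other configuration, a moving picture decoding apparatus decodes encoded data that was encoded by dividing a moving picture into small blocks and selecting either an intra prediction mode or an inter prediction mode as the prediction mode for each block. It also acquires and decodes additional data that, for a key frame and when the motion amount of the image corresponding to the motion of the camera is at or below a predetermined value, was stored and appended separately from the encoded data by encoding the motion amount and the position information of the relevant blocks according to the result of comparing, for each block, the motion vector used in the inter prediction mode with the motion amount. The moving picture decoding apparatus comprises: an additional data decoding unit that, for a key frame, decodes the additional data stored and appended separately from the encoded data to acquire the motion amount and the block position information; a motion compensation unit that generates a motion-compensated block from the motion amount; and a synthesis unit that combines the image of the decoded block of the encoded data corresponding to the block indicated by the position information with the image of the motion-compensated block generated by the motion compensation unit.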
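[Effect of the Invention]
[0021]
According to the present invention, processing to suppress coding distortion in moving regions and to improve the quality of the reproduced image can be added while remaining compatible with the MPEG-4 AVC/H.264 standard.
[Brief Description of the Drawings]
[0022]
[FIG. 1] A block diagram showing the configuration of the moving picture encoding apparatus according to Embodiment 1.
[FIG. 2] A diagram explaining the position information of blocks that need a motion amount or a motion vector.
[FIG. 3] A flowchart explaining the additional data processing procedure of the moving picture encoding apparatus according to Embodiment 1.
[FIG. 4] A block diagram showing the configuration of the moving picture decoding apparatus according to Embodiment 1.
[FIG. 5] A block diagram showing the configuration of the post-processing unit of the moving picture decoding apparatus according to Embodiment 1.
[FIG. 6] A flowchart explaining the additional data processing procedure of the moving picture encoding apparatus according to Embodiment 2.
[FIG. 7] A block diagram showing the configuration of the moving picture encoding apparatus according to Embodiment 3.
[FIG. 8] A flowchart explaining the additional data processing procedure of the moving picture encoding apparatus according to Embodiment 3.
[Explanation of Reference Numerals]
[0023]
101: orthogonal transform unit, 102: quantization unit, 103: variable-length coding unit, 104: rate control unit, 105: inverse quantization unit, 106: inverse orthogonal transform unit, 107: addition unit, 108: in-loop filter, 109: frame memory, 110: intra prediction unit, 111: motion detection unit, 112: motion compensation unit, 113: prediction mode determination unit, 114: subtraction unit, 115: pixel count calculation unit (motion amount calculation unit), 116: comparison unit, 117: additional data encoding unit, 118: encoded data reconstruction unit,
401: input buffer, 402: variable-length decoding unit, 403: inverse quantization unit, 404: inverse orthogonal transform unit, 405: addition unit, 406: in-loop filter, 407: frame memory, 408: intra prediction unit, 409: motion compensation unit, 410: changeover switch, 411: additional data decoding unit, 412: post-processing unit, 501: synthesis unit, 502: motion compensation unit, 503: position information determination unit.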
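[Best Mode for Carrying Out the Invention]
[0024]
Preferred embodiments of the present invention are described below with reference to the drawings.
<Embodiment 1>
FIG. 1 is a block diagram showing the configuration of the moving picture encoding apparatus according to Embodiment 1. In FIG. 1, the moving picture encoding apparatus comprises an orthogonal transform unit 101, a quantization unit 102, a variable-length coding unit 103, a rate control unit 104, an inverse quantization unit 105, an inverse orthogonal transform unit 106, an addition unit 107, an in-loop filter 108, a frame memory 109, an intra prediction unit 110, a motion detection unit 111, a motion compensation unit 112, a prediction mode determination unit 113, a subtraction unit 114, a pixel count calculation unit (motion amount calculation unit) 115, a comparison unit 116, an additional data encoding unit 117, and an encoded data reconstruction unit 118.
[0025]
Of these, the units from the orthogonal transform unit 101 through the subtraction unit 114 are well known in H.264-compliant moving picture encoding apparatuses and are described only briefly.
[0026]
In an H.264-compliant encoding method, each frame of the moving picture stream is divided into macroblocks and encoded macroblock by macroblock. For each macroblock, the encoder decides among the intra prediction mode, the inter prediction mode, and other modes, and encodes accordingly.
[0027]
In the intra prediction mode, a prediction block is generated for the block to be encoded, and the prediction error block calculated from the block to be encoded and the prediction block is encoded by orthogonal transform, quantization, and variable-length coding.
In the inter prediction mode, the positional relationship between the block to be encoded and the corresponding region on the reference frame is detected as a motion vector, the region on the reference frame displaced from the block to be encoded by the motion vector is used as the prediction block, and the prediction error block calculated from the block to be encoded and the prediction block is encoded by orthogonal transform, quantization, and variable-length coding.
[0028]
Each component is described below.
The subtraction unit 114 calculates the prediction error block between the block to be encoded in the input image and the prediction block input from the prediction mode determination unit 113, and outputs it to the orthogonal transform unit 101.
The orthogonal transform unit 101 applies an orthogonal transform such as the discrete cosine transform to the prediction error block input from the subtraction unit 114 and outputs the resulting transform coefficients to the quantization unit 102.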
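[0029]
The quantization unit 102 quantizes the transform coefficients output from the orthogonal transform unit 101 with the quantization scale determined by the quantization parameter input from the rate control unit 104, and outputs the result to the variable-length coding unit 103 and the inverse quantization unit 105.
The rate control unit 104 monitors the amount of code generated by the variable-length coding unit 103, determines the quantization parameter needed to meet the target code amount, and outputs it to the quantization unit 102.
[0030]
The variable-length coding unit 103 variable-length codes the quantized transform coefficients input from the quantization unit 102 and stores them in a buffer (not shown). At this time, it encodes the prediction mode determined by the prediction mode determination unit 113 and, when the prediction mode is inter prediction, the motion vector input from the motion compensation unit 112, and stores them in the header data of each macroblock.
The variable-length coding unit 103 also includes in each macroblock the quantization scale used for quantization in the quantization unit 102.
[0031]
The inverse quantization unit 105 inversely quantizes the quantized transform coefficients based on the quantization scale used in the quantization unit 102 and outputs them to the inverse orthogonal transform unit 106.
The inverse orthogonal transform unit 106 applies, to the inversely quantized transform coefficients input from the inverse quantization unit 105, the inverse orthogonal transform corresponding to the orthogonal transform of the orthogonal transform unit 101, and outputs the result to the addition unit 107.
[0032]
The addition unit 107 adds the prediction block determined by the prediction mode determination unit 113 to the inversely transformed prediction error input from the inverse orthogonal transform unit 106, and outputs the sum to the in-loop filter 108.
The in-loop filter 108 removes block distortion from the locally decoded image input from the addition unit 107 and writes it to the frame memory 109 as a reference frame.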
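[0033]
The intra prediction unit 110 performs intra prediction for each macroblock from locally decoded pixel values in the same picture for each of a plurality of predefined intra prediction modes, determines the optimum prediction mode, and outputs that prediction mode and its prediction block to the prediction mode determination unit 113.
[0034]
The motion detection unit 111 detects the motion vector of the block to be encoded in the input image and outputs it to the motion compensation unit 112, the variable-length coding unit 103, and the comparison unit 116.
The motion detection unit 111 searches a predetermined search range of the reference frame stored in the frame memory 109 for the part similar to the current block, and detects the spatial displacement from the current block as the motion vector.
[0035]
The motion compensation unit 112 generates a prediction block using the motion vector input from the motion detection unit 111 and the reference frame stored in the frame memory 109, and outputs the prediction mode and the prediction block to the prediction mode determination unit 113.
[0036]
The prediction mode determination unit 113 compares the prediction block input from the intra prediction unit 110 and the prediction block input from the motion compensation unit 112 each with the block to be encoded, selects the prediction block judged to have the smaller prediction error, and outputs it to the subtraction unit 114. It also outputs the selected intra or inter prediction mode to the variable-length coding unit 103.
[0037]
Next, the components specific to the moving picture encoding apparatus of Embodiment 1, namely the pixel count calculation unit 115, the comparison unit 116, the additional data encoding unit 117, and the encoded data reconstruction unit 118, are described in detail.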
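[0038]
The camera motion amount Ds (in pixels) can be calculated by the following formula, where Dr is the focal length at the time of shooting, θ is the angle of the shooting direction (lens orientation) of the camera in the horizontal or vertical direction, and α is the ratio used to express the captured image at a predetermined image size.
Ds = Dr × tanθ × α
[0039]
When the subject itself is stationary and the camera only pans, the camera motion amount Ds is almost equal to the motion vector MV detected by the motion detection unit 111.
The pixel count calculation unit 115 calculates, as described above, the horizontal and vertical camera motion amounts Ds in pixels from the shooting parameters of the camera and outputs them to the comparison unit 116.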
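The formula for Ds can be illustrated with a small calculation. The following is a minimal sketch, not the patent's implementation: the function names, the use of degrees for θ, and the sample values are assumptions made for illustration.

```python
import math


def camera_motion_pixels(focal_length, angle_deg, scale):
    """Ds = Dr * tan(theta) * alpha for one axis (theta given in degrees, an assumption)."""
    return focal_length * math.tan(math.radians(angle_deg)) * scale


def camera_motion_vector(focal_length, pan_deg, tilt_deg, scale):
    """Return (horizontal Ds, vertical Ds) as a motion-vector-like pair."""
    return (
        camera_motion_pixels(focal_length, pan_deg, scale),
        camera_motion_pixels(focal_length, tilt_deg, scale),
    )


# Hypothetical parameters: a pure horizontal pan of 1 degree.
ds = camera_motion_vector(focal_length=50.0, pan_deg=1.0, tilt_deg=0.0, scale=10.0)
```

With no tilt, the vertical component is zero, which matches the pan-only case in which Ds is expected to be close to the detected motion vector MV.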
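[0040]
The comparison unit 116 calculates the difference between the camera motion amount Ds and the motion vector MV detected by the motion detection unit 111. When this difference is at or below a predetermined threshold TH_MV, it interprets the block to be encoded as moving similarly to the camera work and outputs the position information of the block and the motion amount Ds to the additional data encoding unit 117. When the difference is greater than TH_MV, it outputs nothing.
[0041]
The additional data encoding unit 117 encodes the position information and the camera motion amount Ds output from the comparison unit 116 and outputs them to the encoded data reconstruction unit 118.
Because the camera motion amount Ds is a single vector per picture, it only needs to be encoded once per picture.
The position information of the blocks that need the motion amount Ds can be encoded, for example, as shown in FIG. 2: each block position is mapped to one bit of a plane, blocks that need it are set to 1 and blocks that do not are set to 0, and the result is run-length coded.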
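The run-length coding of the one-bit-per-block position plane can be sketched as follows. The patent does not specify the run-length format, so the representation used here (the first bit value plus a list of run lengths) is an assumption for illustration only.

```python
def run_length_encode(bits):
    """Encode a flat list of 0/1 flags as (first_bit, run_lengths)."""
    if not bits:
        return None, []
    runs = []
    current, count = bits[0], 0
    for b in bits:
        if b == current:
            count += 1
        else:
            runs.append(count)      # close the finished run
            current, count = b, 1   # start a new run with the other value
    runs.append(count)
    return bits[0], runs


def run_length_decode(first_bit, runs):
    """Invert run_length_encode: runs alternate between the two bit values."""
    bits, value = [], first_bit
    for n in runs:
        bits.extend([value] * n)
        value ^= 1
    return bits


# Hypothetical position plane for seven blocks in raster order.
plane = [0, 0, 1, 1, 1, 0, 1]
encoded = run_length_encode(plane)
```

Because neighbouring blocks in a panning scene tend to share the same flag, long runs are likely, which is presumably why run-length coding is suggested.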
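[0042]
The encoded data reconstruction unit 118 adds the additional data encoded by the additional data encoding unit 117 to the buffer holding the encoded data produced by the variable-length coding unit 103, thereby reconstructing the encoded data.
The additional data can be stored in the buffer in the following ways.
[0043]
A) Storing it separately from the encoded moving picture data:
When encoded data is multiplexed with audio, text data, and so on, it is represented as packets, each consisting of a packet header and packet data. For example, when MPEG-2 Systems is used as the multiplexing scheme, the packet header stores the identifier of the data carried in the packet data (stream_id), the playback output time management information (PTS: Presentation Time Stamp), the decoding time management information (DTS: Decoding Time Stamp), and so on.
The packet data stores the data indicated by the stream_id identifier.
[0044]
To store the additional data, the stream_id field of the packet header is set to private_stream_1 (binary 10111101) or metadata_stream (binary 11111100), and the additional data encoded by the additional data encoding unit 117 is stored as the packet data.
In addition, the playback output time and the decoding time of the encoded data with which the additional data is synchronized are stored in the PTS and DTS of the packet header, respectively.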
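The packet fields named above can be pictured with a simplified container. This is only a sketch: the `Packet` class and `wrap_additional_data` function are hypothetical, and the structure does not reproduce the actual MPEG-2 Systems bitstream syntax. Only the two stream_id values come from the text.

```python
from dataclasses import dataclass

# stream_id values quoted in the text (binary 10111101 and 11111100).
PRIVATE_STREAM_1 = 0b10111101  # 0xBD
METADATA_STREAM = 0b11111100   # 0xFC


@dataclass
class Packet:
    """Simplified stand-in for a packet header plus packet data."""
    stream_id: int
    pts: int        # presentation time of the picture the data is synchronized with
    dts: int        # decoding time of that picture
    payload: bytes  # encoded additional data


def wrap_additional_data(payload, pts, dts, stream_id=PRIVATE_STREAM_1):
    """Attach the additional data to the timestamps of the picture it belongs to."""
    if stream_id not in (PRIVATE_STREAM_1, METADATA_STREAM):
        raise ValueError("additional data expects private_stream_1 or metadata_stream")
    return Packet(stream_id=stream_id, pts=pts, dts=dts, payload=payload)


# Hypothetical timestamps and payload bytes.
packet = wrap_additional_data(b"\x01\x02", pts=90000, dts=87000)
```

A decoder that does not recognize the stream_id can skip such packets, which is what keeps the video stream itself decodable without the additional data.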
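[0045]
B) Storing it within the encoded moving picture data:
H.264 allows redundant pictures to be stored in addition to the encoded data of ordinary image data (primary pictures).
In this case, the position information and the motion amount Ds output from the comparison unit 116 are converted into the H.264 encoded data format and stored in a redundant picture.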
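[0046]
Next, the additional data processing procedure is described using the flowchart of FIG. 3. FIG. 3 is an example in which the camera motion amount Ds is encoded once per picture and the position information is run-length coded.
[0047]
First, the shooting parameters (Dr, θ, α) of the picture to be encoded are read from the camera (step S1), the horizontal and vertical camera motion amounts Ds are calculated from these parameters in pixels (step S2), and the motion amount Ds is encoded (step S3).
Steps S1 and S2 constitute the pixel count calculation unit 115.
[0048]
Steps S4 to S8 below constitute the comparison unit 116.
Steps S4 to S8 are repeated for each macroblock of the picture to be encoded.
The difference between the camera motion amount Ds and the motion vector MV of the block to be encoded, detected by the motion detection unit 111, is calculated. When this difference is at or below the predetermined threshold TH_MV (YES in step S4), 1 is set at the position of the block to be encoded (step S5).
When the difference is greater than the predetermined threshold TH_MV (NO in step S4), 0 is set at the position of the block to be encoded (step S6).
If not all blocks have been processed (NO in step S7), processing advances to another unprocessed block (step S8) and returns to step S4.
[0049]
When all blocks have been processed (YES in step S7), the position information is run-length coded (step S9). Steps S3 and S9 constitute the additional data encoding unit 117.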
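Steps S4 to S8 amount to one pass over the macroblocks that builds the position plane. The sketch below assumes that the "difference" between Ds and MV is the sum of absolute component differences; the patent does not name a metric, and the function names are hypothetical.

```python
def mv_difference(a, b):
    """Assumed metric: sum of absolute differences of the x and y components."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def build_position_map(ds, block_mvs, th_mv):
    """Steps S4 to S8: flag each block whose MV is within TH_MV of the camera motion Ds."""
    position_bits = []
    for mv in block_mvs:                      # one iteration per macroblock
        if mv_difference(ds, mv) <= th_mv:    # step S4
            position_bits.append(1)           # step S5
        else:
            position_bits.append(0)           # step S6
    return position_bits


# Hypothetical picture with four macroblocks in raster order.
bits = build_position_map(ds=(4, 0), block_mvs=[(4, 0), (5, 1), (0, 0), (3, 0)], th_mv=1)
```

The resulting list is the plane that step S9 would then run-length code, while Ds itself is encoded once for the picture in step S3.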
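[0050]
In the method above, whether the motion of the block to be encoded resembles the camera work is judged from the difference between the camera motion amount Ds and the motion vector MV of the block.
Alternatively, the camera motion amount Ds can be input to the motion detection unit 111, the prediction error obtained when Ds is used as the motion vector can be calculated separately, and this value can be compared with the prediction error of the motion vector MV obtained by the motion detection unit 111 to judge whether the motion of the block resembles the camera work.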
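The alternative judgement based on prediction error could look like the following sketch. It assumes the prediction error is a sum of absolute differences (SAD), that Ds has been rounded to whole pixels, and that a block counts as camera-like when using Ds costs at most `tolerance` more than using MV. None of these choices is stated in the patent.

```python
def sad(cur, ref, bx, by, size, mv):
    """Sum of absolute differences between a block in cur and the displaced block in ref."""
    dx, dy = mv
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx])
    return total


def follows_camera(cur, ref, bx, by, size, ds, mv, tolerance):
    """Assumed rule: Ds is acceptable if its SAD exceeds the SAD of MV by at most tolerance."""
    return sad(cur, ref, bx, by, size, ds) - sad(cur, ref, bx, by, size, mv) <= tolerance


# Hypothetical 8x8 frames: cur is ref shifted left by one pixel (a pure pan).
ref = [[x * x + 3 * y for x in range(8)] for y in range(8)]
cur = [[ref[y][min(x + 1, 7)] for x in range(8)] for y in range(8)]
```

For the 2×2 block at (2, 2), a Ds of (1, 0) reproduces the block exactly, whereas a Ds of (0, 0) leaves a residual, so only the former would be flagged.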
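[0051]
The method above judges only whether the motion of the block to be encoded resembles the camera work. It is also possible to add the prediction mode selected by the prediction mode determination unit 113 as a condition, so that only blocks whose motion resembles the camera work and which were assigned the intra prediction mode are encoded as blocks that need the motion amount Ds on the moving picture decoding apparatus side.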
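[0052]
Next, the moving picture decoding apparatus according to Embodiment 1 is described. This moving picture decoding apparatus performs decoding using the encoded data and the additional data output from the moving picture encoding apparatus.
[0053]
FIG. 4 is a block diagram of the moving picture decoding apparatus according to Embodiment 1. In FIG. 4, the moving picture decoding apparatus comprises an input buffer 401, a variable-length decoding unit 402, an inverse quantization unit 403, an inverse orthogonal transform unit 404, an addition unit 405, an in-loop filter 406, a frame memory 407, an intra prediction unit 408, a motion compensation unit 409, a changeover switch 410, an additional data decoding unit 411, and a post-processing unit 412.
[0054]
Of these, the units from the variable-length decoding unit 402 through the changeover switch 410 are well known in H.264-compliant moving picture decoding apparatuses and are described only briefly. With the configuration of the moving picture decoding apparatus of Embodiment 1, H.264-compliant encoded data can be decoded normally even without the additional data.
[0055]
The input coded data is first accumulated in the input buffer 401. As needed, the encoded moving picture data is supplied to the variable-length decoding unit 402 and the additional data is supplied to the additional data decoding unit 411.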
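[0056]
The variable-length decoding unit 402 decodes the encoded moving picture data and separates the prediction mode, motion vector, quantization scale, and quantized transform coefficients of each macroblock.
The variable-length decoding unit 402 outputs the quantization scale and the quantized transform coefficients to the inverse quantization unit 403, the prediction mode to the changeover switch 410, and the motion vector to the motion compensation unit 409.
[0057]
The inverse quantization unit 403 inversely quantizes the transform coefficients input from the variable-length decoding unit 402 and outputs them to the inverse orthogonal transform unit 404.
The inverse orthogonal transform unit 404 applies the inverse orthogonal transform to the inversely quantized transform coefficients input from the inverse quantization unit 403 and outputs the result to the addition unit 405.
The addition unit 405 adds the prediction block output from the changeover switch 410 to the prediction error block output from the inverse orthogonal transform unit 404, and outputs the sum to the intra prediction unit 408 and the in-loop filter 406.
The in-loop filter 406 filters the decoded block input from the addition unit 405 and stores it in the frame memory 407.
[0058]
The intra prediction unit 408 outputs to the changeover switch 410 the prediction block obtained by intra prediction from decoded pixel values in the same picture input from the addition unit 405.
The motion compensation unit 409 calculates a prediction block from the decoded picture stored in the frame memory 407 and the motion vector decoded by the variable-length decoding unit 402, and outputs it to the changeover switch 410.
[0059]
The changeover switch 410 selects the prediction block from the intra prediction unit 408 when the prediction mode input from the variable-length decoding unit 402 is the intra prediction mode, selects the prediction block from the motion compensation unit 409 when it is the inter prediction mode, and outputs the selected block to the addition unit 405.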
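[0060]
The components specific to the moving picture decoding apparatus of Embodiment 1, namely the additional data decoding unit 411 and the post-processing unit 412, are described in detail below.
[0061]
The additional data decoding unit 411 decodes the additional data input from the input buffer 401 and outputs the information related to the shooting parameters of the camera (the camera motion amount Ds and the position information of the blocks that need Ds) to the post-processing unit 412.
[0062]
The post-processing unit 412 performs post-processing using this information. As shown in FIG. 5, it comprises a synthesis unit 501, a motion compensation unit 502, and a position information determination unit 503. The motion compensation unit 502 has the same function as the motion compensation unit 409 in FIG. 4.
[0063]
The position information determination unit 503 extracts, from the position information of the blocks that need the camera motion amount Ds, the blocks that require processing by the motion compensation unit 502, converts them into horizontal and vertical position information, and notifies the motion compensation unit 502 of that position information.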
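The conversion from the decoded position plane to horizontal and vertical block positions can be sketched as below. Raster-scan ordering of the bits and a 16×16 macroblock size are assumptions; the patent only says that the positions are converted.

```python
def blocks_needing_compensation(position_bits, blocks_per_row, block_size=16):
    """Return the top-left pixel coordinates (x, y) of every block flagged with 1."""
    coords = []
    for index, flag in enumerate(position_bits):
        if flag:
            row, col = divmod(index, blocks_per_row)  # raster-scan order (assumed)
            coords.append((col * block_size, row * block_size))
    return coords


# Hypothetical picture that is three macroblocks wide and two high.
targets = blocks_needing_compensation([0, 1, 0, 0, 0, 1], blocks_per_row=3)
```

Each returned coordinate identifies one block that would be handed to the motion compensation unit 502.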
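[0064]
The motion compensation unit 502 generates a motion-compensated block for the block being decoded, using the position information of the block to be motion compensated, the camera motion amount Ds (which serves as the motion vector), and the reference picture stored in the frame memory 407, and outputs it to the synthesis unit 501.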
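Generating a motion-compensated block from Ds can be sketched as a displaced copy from the reference picture. The sketch assumes integer-pixel displacement and clamps coordinates at the picture border; sub-pixel interpolation as used in H.264 is deliberately left out, and the function name is hypothetical.

```python
def motion_compensate(ref, x, y, size, mv):
    """Copy the size x size block at (x, y) displaced by mv from the reference picture."""
    dx, dy = mv
    height, width = len(ref), len(ref[0])
    block = []
    for j in range(size):
        row = []
        for i in range(size):
            ry = min(max(y + j + dy, 0), height - 1)  # clamp at the border (assumed)
            rx = min(max(x + i + dx, 0), width - 1)
            row.append(ref[ry][rx])
        block.append(row)
    return block


# Hypothetical 4x4 reference picture with distinct pixel values.
reference = [[10 * r + c for c in range(4)] for r in range(4)]
mc_block = motion_compensate(reference, x=0, y=0, size=2, mv=(1, 1))
```

Because the same Ds is used for every flagged block in the picture, only the block position changes between calls.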
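[0065]
The synthesis unit 501 combines the motion-compensated block input from the motion compensation unit 502 with the corresponding decoded block in the decoded picture input from the frame memory 407 (the H.264-compliant moving picture decoding apparatus) and outputs the reproduced image.
For blocks that do not need the camera motion amount Ds, however, no motion compensation is performed, and the synthesis unit 501 outputs the decoded block from the H.264-compliant moving picture decoding apparatus as the reproduced image as is.
[0066]
The decoded block and the motion-compensated block may be combined by averaging the two blocks, by adaptive processing such as a weighted sum that depends on the prediction mode type or block size of the block decoded by the H.264-compliant moving picture decoding apparatus, or by replacing the decoded block with the motion-compensated block.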
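The three combination options (average, weighted sum, replacement) can be expressed with a single weight parameter. This is a sketch under the assumption of non-negative integer pixel values; how the weight would be chosen from the prediction mode or block size is not specified in the patent, so it is left to the caller.

```python
def blend_blocks(decoded, compensated, weight=0.5):
    """Blend two blocks: weight 0.5 averages, 1.0 replaces, 0.0 keeps the decoded block."""
    return [
        [int((1.0 - weight) * d + weight * c + 0.5)  # round half up for non-negative pixels
         for d, c in zip(decoded_row, compensated_row)]
        for decoded_row, compensated_row in zip(decoded, compensated)
    ]


# Hypothetical 1x2 blocks.
decoded_block = [[100, 50]]
compensated_block = [[120, 70]]
averaged = blend_blocks(decoded_block, compensated_block)
replaced = blend_blocks(decoded_block, compensated_block, weight=1.0)
weighted = blend_blocks(decoded_block, compensated_block, weight=0.25)
```

Since the blend is applied only to the output picture, different decoders can pick different weights without affecting decoding of later pictures.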
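[0067]
Because this post-processing is applied to each block of the picture just before display, it does not affect the reference pictures used for prediction that are stored in the frame memory 407, so the moving picture decoding apparatus is free to process as it sees fit.
[0068]
As described above, by providing the camera motion amount to the moving picture decoding apparatus as additional data separate from the H.264-compliant encoded data, quality-enhancing processing can be applied to the reproduced image according to the capability of the moving picture decoding apparatus while keeping compatibility with H.264.
Even when the moving picture decoding apparatus does not perform the post-processing, or has no post-processing function, the minimum picture quality decoded from the H.264-compliant encoded data is guaranteed.
[0069]
In addition, because the moving picture encoding apparatus sends the position information of the blocks that follow the camera motion, the moving picture decoding apparatus obtains more accurate position information than it could by estimating such block positions itself. The moving picture encoding apparatus produces encoded data that is optimal in terms of the cost calculation. By adding only the code amount of one motion vector per picture and at most one bit per block, the moving picture decoding apparatus can reproduce a picture that is also visually optimal.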
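[0070]
<Embodiment 2>
Embodiment 2 describes a moving picture encoding apparatus that encodes, as additional data, the information needed to suppress flicker occurring at GOP intervals, and a moving picture decoding apparatus that decodes that encoded data.
The configuration of the moving picture encoding apparatus according to Embodiment 2 is the same as in FIG. 1, so only the differences are described below.
[0071]
In key frames, pictures are normally encoded using only the intra prediction mode, so no motion information is encoded. In Embodiment 2, however, the camera motion amount for the block to be encoded, or the motion vector obtained by the motion detection unit 111, is encoded as additional data.
[0072]
Next, the additional data processing procedure for key frames is described using the flowchart of FIG. 6.
First, the shooting parameters (Dr, θ, α) of the picture to be encoded are read from the camera, and the horizontal and vertical camera motion amounts Ds are calculated from these parameters in pixels (step S11). Step S11 constitutes the pixel count calculation unit 115.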
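[0073]
Steps S12 to S20 below constitute the comparison unit 116.
If the camera motion amount Ds is greater than a predetermined threshold FTH (NO in step S12), the scene is interpreted as a fast-moving scene in which flicker does not occur, and no special processing is performed.
If the camera motion amount Ds is smaller than the predetermined threshold FTH (YES in step S12), steps S14 to S19 are repeated for each macroblock of the picture to be encoded.
[0074]
The motion amount Ds is encoded (step S13), and, as in the normal inter prediction mode, the motion detection unit 111 obtains the motion vector MV of the target block (step S14).
The difference between the camera motion amount Ds and the motion vector MV is calculated. When this difference is at or below a predetermined threshold FMV (YES in step S15), the target block is interpreted as a block in which flicker is easily noticed, and 1 is set at the position of the target block (step S16).
When the difference is greater than the predetermined threshold FMV (NO in step S15), 0 is set at the position of the target block (step S17).
If not all blocks have been processed (NO in step S18), processing advances to another unprocessed block (step S19) and returns to step S14.
[0075]
When all blocks have been processed (YES in step S18), the position information is run-length coded (step S20). Step S20 constitutes the additional data encoding unit 117.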
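The key-frame procedure of FIG. 6 adds a gate on the overall camera motion before the per-block comparison. The sketch below assumes that the size of Ds and the Ds-to-MV difference are both measured as sums of absolute components, and that a Ds equal to FTH is still processed (the claims say "at or below a predetermined value"). The function names are hypothetical.

```python
def l1(v):
    """Assumed magnitude: sum of absolute x and y components."""
    return abs(v[0]) + abs(v[1])


def keyframe_flicker_map(ds, block_mvs, fth, fmv):
    """Steps S12 to S19: return None for fast scenes, else one flag per macroblock."""
    if l1(ds) > fth:                                   # step S12: fast scene, no flicker expected
        return None
    flags = []
    for mv in block_mvs:                               # MV found as in inter prediction (step S14)
        diff = l1((ds[0] - mv[0], ds[1] - mv[1]))      # step S15
        flags.append(1 if diff <= fmv else 0)          # steps S16 and S17
    return flags


# Hypothetical key frames: one fast pan and one slow pan.
fast = keyframe_flicker_map(ds=(10, 0), block_mvs=[(10, 0)], fth=4, fmv=1)
slow = keyframe_flicker_map(ds=(1, 0), block_mvs=[(1, 0), (3, 2), (0, 0)], fth=4, fmv=1)
```

Returning `None` for the fast scene mirrors the "no special processing" branch, so no additional data would be produced for that key frame.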
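[0076]
Another way to identify blocks in which flicker is easily noticed is to compare the motion vectors of the key frame with those of the past frame.
In this method, the motion vectors MVp of the P or B picture immediately before the key frame are stored for one full picture, and the motion vector MV of the key frame and the motion vector MVp of the past frame at the same block position are compared. When the difference is small, it is judged that there is no large motion across the key frame, and the block is interpreted as one in which flicker is easily noticed.
In this case, the moving picture decoding apparatus must hold the motion vectors of the past frame immediately before the key frame, and the moving picture encoding apparatus must encode the motion vector MV of each block as additional data for the key frame as well.
[0077]
In this way, for key frames prone to flicker, the camera motion amount Ds or the motion vector MV of the block to be encoded, together with the positions of the blocks that need motion information, is encoded as additional data separately from the H.264-compliant encoded data output from the variable-length coding unit 103. This allows the information needed to suppress flicker to be sent to the moving picture decoding apparatus.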
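[0078]
Next, the moving picture decoding apparatus according to Embodiment 2 is described. This moving picture decoding apparatus performs decoding using the encoded data and the additional data output by the moving picture encoding apparatus according to Embodiment 2.
The configuration of the moving picture decoding apparatus according to Embodiment 2 is the same as in FIG. 4 and FIG. 5, so only the differences are described below.
[0079]
The moving picture encoding apparatus according to Embodiment 2 encodes as additional data, separately from the H.264-compliant encoded data, the blocks in key frames in which flicker is easily noticed and either the motion vectors of those blocks or the corresponding camera motion amount.
[0080]
In the post-processing unit 412, the camera motion amount Ds decoded by the additional data decoding unit 411 and the position information of the blocks in which flicker is noticeable are input only for key frames, and the unit operates in the same way as in the moving picture decoding apparatus of Embodiment 1. For blocks in which flicker is not noticeable, however, no motion compensation is performed, and the decoded block from the H.264-compliant moving picture decoding apparatus is output as the reproduced block as is.
[0081]
As described above, by using the shooting parameters of the camera and the like as additional data, the moving picture encoding apparatus can reuse its existing configuration for computationally heavy processing such as motion detection, without adding new components, and can still send the moving picture decoding apparatus information for suppressing coding distortion and for improving the quality of the reproduced image. The moving picture decoding apparatus can choose, according to its own processing capability, whether to perform the distortion suppression and quality enhancement and how much of it to perform.
[0082]
For example, in the post-processing of the moving picture decoding apparatus, combining the texture of the motion-compensated block with the decoded block suppresses the change in picture quality before and after the key frame and yields a reproduced image in which flicker is less noticeable.
Because additional data is added only for key frames, the method does not depend on the GOP structure and can be applied to a variety of applications.
[0083]
Furthermore, when the motion vector MV of the blocks in which flicker is easily noticed is used as the additional data, this can be handled simply by changing the motion vector input to the motion compensation unit 502 block by block.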
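[0084]
<Embodiment 3>
Embodiments 1 and 2 dealt with the case where the whole image moves because of camera work. Embodiment 3 deals with the case where only part of the image moves, rather than global motion caused by camera work, so camera parameters are not used.
[0085]
FIG. 7 is a block diagram showing the configuration of the moving picture encoding apparatus according to Embodiment 3. Components in FIG. 7 with the same functions as in FIG. 1 bear the same reference numerals, and their description is omitted. FIG. 7 differs from FIG. 1 in that the pixel count calculation unit 115 is removed and the prediction mode determination unit 113 and the comparison unit 116 operate differently.
[0086]
The prediction mode determination unit 113 determines the prediction mode in the same way as in Embodiments 1 and 2. However, even when the intra prediction mode is selected, the inter prediction error may also be small. This is considered to be a case where the block to be encoded has motion but the intra prediction mode was selected because of the way the prediction error is calculated.
[0087]
Therefore, in addition to the determined prediction mode, the prediction mode determination unit 113 outputs to the comparison unit 116 the difference Dmode between the prediction error Dintra calculated by the intra prediction unit 110 and the prediction error Dinter calculated by the motion detection unit 111 (or the motion compensation unit 112).
Dmode = |Dintra - Dinter|
[0088]
The comparison unit 116 outputs to the additional data encoding unit 117 the positions of the blocks whose difference Dmode, input from the prediction mode determination unit 113, is at or below a predetermined threshold THMODE, together with the motion vectors MV detected for those blocks by the motion detection unit 111.
The additional data encoding unit 117 encodes the motion vectors MV input from the comparison unit 116 and the position information of those blocks in the same way as in Embodiment 1.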
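[0089]
Next, the additional data processing procedure in Embodiment 3 is described using the flowchart of FIG. 8.
First, the prediction error Dintra calculated by the intra prediction unit 110 and the prediction error Dinter calculated by the motion detection unit 111 (or the motion compensation unit 112) are input (step S31), the difference Dmode between Dintra and Dinter is calculated (step S32), and the prediction mode is determined in the same way as in Embodiments 1 and 2 (step S33).
Steps S31 to S33 constitute the prediction mode determination unit 113.
[0090]
Steps S34 to S39 are repeated for each macroblock of the picture to be encoded.
When the prediction mode of the target block is not the intra prediction mode (NO in step S34), processing proceeds to step S37.
When the prediction mode of the target block is the intra prediction mode (YES in step S34) and the difference Dmode is at or below the predetermined threshold THMODE (YES in step S35), 1 is set at the position of the target block and the motion vector MV is stored in association with that position (step S36).
When the difference Dmode is greater than the predetermined threshold THMODE (NO in step S35), 0 is set at the position of the target block (step S37).
If not all blocks have been processed (NO in step S38), processing advances to another unprocessed block (step S39) and returns to step S34.
Steps S34 to S39 constitute the comparison unit 116.
[0091]
When all blocks have been processed (YES in step S38), the position information is run-length coded (step S40), and the motion vector MV is encoded for each block whose position information is 1 (steps S41 to S44).
Steps S40 to S43 constitute the additional data encoding unit 117.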
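Steps S34 to S39 can be sketched as a filter over per-block records. The dictionary keys (`mode`, `d_intra`, `d_inter`, `mv`) and the function name are hypothetical; only the rule itself (intra mode selected and Dmode at or below THMODE) comes from the text.

```python
def select_additional_motion(blocks, th_mode):
    """Steps S34 to S39: flag intra blocks whose intra and inter errors are close."""
    position_bits, vectors = [], []
    for block in blocks:
        d_mode = abs(block["d_intra"] - block["d_inter"])   # Dmode = |Dintra - Dinter|
        if block["mode"] == "intra" and d_mode <= th_mode:  # steps S34 and S35
            position_bits.append(1)                         # step S36
            vectors.append(block["mv"])                     # MV kept with the flagged block
        else:
            position_bits.append(0)                         # step S37
    return position_bits, vectors


# Hypothetical macroblock records in raster order.
records = [
    {"mode": "intra", "d_intra": 100, "d_inter": 104, "mv": (2, 1)},
    {"mode": "inter", "d_intra": 300, "d_inter": 120, "mv": (0, 3)},
    {"mode": "intra", "d_intra": 90, "d_inter": 400, "mv": (5, 5)},
]
flags, mvs = select_additional_motion(records, th_mode=8)
```

The vectors are listed in the same order as the 1 bits in the position plane, so a decoder reading both can pair each flagged block with its motion vector.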
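[0092]
As described above, for blocks in which the difference between the prediction errors of the intra prediction mode and the inter prediction mode is small, the motion vector that would have been used had the inter prediction mode been selected is encoded and provided separately from the H.264-compliant encoded data produced with the intra prediction mode. This provides the moving picture decoding apparatus with information about local motion.
The additional data sent separately from the encoded data consists only of position information and motion vectors, so the amount of data this adds is very small.
[0093]
The motion vector encoded as additional data is one that was already calculated for prediction mode selection, so no new motion detection or motion compensation processing is needed to add it. The moving picture decoding apparatus can therefore use the accurate motion information produced by the moving picture encoding apparatus to improve the quality of the reproduced image in post-processing, and a variety of post-processing methods based on the motion vectors and position information can be applied.
[0094]
Next, the moving picture decoding apparatus according to Embodiment 3 is described. This moving picture decoding apparatus performs decoding using the encoded data and the additional data output by the moving picture encoding apparatus according to Embodiment 3.
[0095]
The additional data in Embodiment 3 consists of, for each block in which the difference between the prediction errors of the intra prediction mode and the inter prediction mode is small and the intra prediction mode was selected, the position information of the block and the motion vector that would apply if the inter prediction mode were selected.
Accordingly, the configuration of the moving picture decoding apparatus according to Embodiment 3 is the same as in FIG. 4 and FIG. 5, except that the motion vector input to the motion compensation unit 502 in FIG. 5 differs from block to block.
[0096]
In the moving picture decoding apparatus, separately from the H.264-compliant encoded data, the motion vector that would apply if the inter prediction mode were selected is provided as additional data for each block in which the difference between the prediction errors of the intra prediction mode and the inter prediction mode is small and the intra prediction mode was selected. The quality of the reproduced image can therefore be improved according to the capability of the apparatus.
[0097]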
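The present invention is not limited to the embodiments described above, and various changes and modifications can of course be made without departing from the gist of the invention.
[Technical field]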
[0001]
The present invention relates to a moving picture coding apparatus and a moving picture decoding apparatus capable of performing an encoding / decoding process according to the performance of the apparatus.
[Background]
[0002]
MPEG-4 AVC / H.264 (hereinafter abbreviated as H.264), which is the latest international standard for moving picture coding technology, has better prediction performance than conventional coding technology owing to improvements in intra prediction technology and inter prediction technology.
[0003]
For example, in the intra prediction technique, conventionally, prediction is performed on transform coefficients after DCT (Discrete Cosine Transform), and there are three types of prediction methods: DC prediction and AC prediction in the horizontal / vertical direction.
However, in H.264, prediction is performed on pixel values before DCT, and prediction methods are 9 types in units of 4 × 4 blocks, 9 types in units of 8 × 8 blocks, and 4 types in units of 16 × 16 blocks. Since the most suitable prediction mode can be selected from many prediction modes, the prediction accuracy is increased.
[0004]
In the inter prediction technique, the motion compensation is conventionally only 16 × 16 block size and 8 × 8 block size.
However, in H.264, there are seven types of 16 × 16, 16 × 8, 8 × 16, 8 × 8, 8 × 4, 4 × 8, and 4 × 4, and the prediction accuracy is increased.
[0005]
Since the prediction accuracy has been improved in this way, even when realizing the same level of image quality as before, the prediction error is reduced and the amount of code required for encoding is reduced. On the other hand, when the quantization step is coarse, there is a problem that distortion different from the conventional one appears.
[0006]
For example, whether the whole image moves due to camera work such as panning or only a part of the image moves, the performance of inter prediction conventionally exceeded that of intra prediction, and inter prediction was selected for almost all moving regions.
However, in H.264, since the prediction accuracy within the same screen has been improved, intra prediction is now selected in flat regions with poor texture even when they are moving regions.
[0007]
In this way, when blocks that have selected intra prediction and blocks that have selected inter prediction are mixed, the fineness of the texture differs, so the visual impression is worse. In addition, when intra prediction is selected instead of inter prediction, it becomes difficult to reproduce the texture of the image.
[0008]
Further, conventionally, when content is encoded, key frames (or fields, hereinafter collectively referred to as frames) are periodically inserted for random access and high-speed playback.
In H.264, an IDR (Instantaneous Decoder Refresh) picture is used as the key frame, and a closed GOP (Group Of Pictures) structure, which prohibits a picture decoded after the IDR picture from referring to a picture decoded before the IDR picture, is often adopted.
[0009]
In this case, in still scenes with highly complex textures and in slowly moving scenes such as pans, noise causes the in-loop filter strength and the selected intra prediction mode to differ at each IDR picture, and this appears as a distortion recognized by the user: flicker at the IDR picture insertion interval (GOP).
[0010]
In order to solve the above problem, in Non-Patent Document 1, in a scene in which the entire screen is moved by camera work, the motion of the entire screen or an intra prediction block in the vicinity of the encoding target block is determined when determining the prediction mode for P and B pictures. A method of selecting an intra / inter prediction mode is proposed in consideration of the ratio.
[0011]
Further, in Patent Document 1, when it is determined that GOP-based flicker is easily recognized, a method of encoding using an open GOP instead of a closed GOP or a method of selecting an intra / inter prediction mode is changed, A method has been proposed for avoiding sudden changes in the image in the time direction that are recognized as distortion (flicker).
[0012]
Further, Patent Document 2 proposes a method for suppressing the distortion (flicker) in which the decoding device adjusts the filtering strength in the time and space directions based on the information of the encoded data and blurs the texture of the portions determined to be prone to flicker.
[Non-Patent Document 1]
Toshinobu Yoshino and two others, “A Study on Improvement of Subjective Image Quality in H.264 / MPEG-4 AVC Coding”, Proceedings of the 2006 Annual Conference of the Video Information Media Society of Japan, Image Information Media Society of Japan, December 2006, No. 1-1
[Patent Document 1]
JP 2007-214785 A
[Patent Document 2]
JP 2006-229411 A
DISCLOSURE OF THE INVENTION
[Problems to be solved by the invention]
[0013]
However, Non-Patent Document 1 has the following problems.
・A new calculation of the movement of the entire screen is required, which increases the amount of processing.
・As a result of the cost calculation, a block that was originally determined to have a smaller code amount with intra prediction is changed to inter prediction, so the code amount may increase.
・Since it targets scenes in which the entire screen moves, it cannot be applied to a scene in which only a part of the screen is a moving area.
[0014]
Moreover, Patent Document 1 has the following problems.
・When changing to an open GOP, pictures that cannot be decoded normally occur, so additional processing is required, such as preventing those pictures from being displayed.
・It cannot be applied to applications that cannot support an open GOP, such as 1Seg.
[0015]
Further, in Patent Document 2, since the filter strength is partially changed, there are problems such as flicker becoming more conspicuous instead when the determination is wrong, and the filter strength being difficult to adjust.
[0016]
The present invention has been made in consideration of the above-mentioned circumstances, and while maintaining compatibility with the standard MPEG-4 AVC / H.264, it suppresses coding distortion with respect to a moving region and increases the quality of a reproduced image. An object of the present invention is to provide a moving image encoding device and a moving image decoding device that can add the above process.
[Means for Solving the Problems]
[0017]
In order to solve the above problems, the present invention is configured as follows.
A moving picture encoding apparatus divides a moving picture into small blocks, selects either an intra prediction mode or an inter prediction mode as the prediction mode for each block, and outputs encoded data. The apparatus comprises: a motion amount calculation unit that calculates, based on the shooting parameters of a camera, the motion amount of the image corresponding to the motion of the camera (equivalent to a motion vector); a comparison unit that compares, for each block, the motion vector used in the inter prediction mode with the motion amount; and an additional data encoding unit that, according to the comparison result, encodes the motion amount and the position information of the relevant blocks and stores and appends them separately from the encoded data. Compared with decoding the encoded data alone, adding the decoding of the additional data improves the quality of the reproduced image.
[0018]
Corresponding to the above moving picture encoding apparatus, a moving picture decoding apparatus decodes encoded data that was encoded by dividing a moving picture into small blocks and selecting either an intra prediction mode or an inter prediction mode as the prediction mode for each block. It also acquires and decodes additional data that was stored and appended separately from the encoded data by encoding the motion amount and the position information of the relevant blocks according to the result of comparing, for each block, the motion vector used in the inter prediction mode with the motion amount of the image corresponding to the motion of the camera. The moving picture decoding apparatus comprises: an additional data decoding unit that decodes the additional data stored and appended separately from the encoded data to acquire the motion amount and the block position information; a motion compensation unit that generates a motion-compensated block from the motion amount; and a synthesis unit that combines the image of the decoded block of the encoded data corresponding to the block indicated by the position information with the image of the motion-compensated block generated by the motion compensation unit.
[0019]
Furthermore, as another configuration, the moving image encoding apparatus divides the moving image into small blocks, selects either the intra prediction mode or the inter prediction mode as the prediction mode for each block, and outputs encoded data. In the moving image encoding device, for a key frame, a motion amount calculation unit that calculates a motion amount of an image corresponding to the motion of the camera based on a shooting parameter of the camera, and when the motion amount is a predetermined value or less, A comparison unit that compares the motion vector used in the inter prediction mode for each block and the motion amount, and encodes the motion amount and position information of the corresponding block according to the comparison result, And an additional data encoding unit for storing and adding separately.
[0020]
Corresponding to the moving picture encoding apparatus of this other configuration, a moving picture decoding apparatus decodes encoded data that was encoded by dividing a moving picture into small blocks and selecting either an intra prediction mode or an inter prediction mode as the prediction mode for each block. It also acquires and decodes additional data that, for a key frame and when the motion amount of the image corresponding to the motion of the camera is at or below a predetermined value, was stored and appended separately from the encoded data by encoding the motion amount and the position information of the relevant blocks according to the result of comparing, for each block, the motion vector used in the inter prediction mode with the motion amount. The moving picture decoding apparatus comprises: an additional data decoding unit that, for a key frame, decodes the additional data stored and appended separately from the encoded data to acquire the motion amount and the block position information; a motion compensation unit that generates a motion-compensated block from the motion amount; and a synthesis unit that combines the image of the decoded block of the encoded data corresponding to the block indicated by the position information with the image of the motion-compensated block generated by the motion compensation unit.
Effect of the invention
[0021]
According to the present invention, it is possible to add processing for suppressing encoding distortion in moving regions and for improving the quality of the reproduced image while maintaining compatibility with the standard MPEG-4 AVC / H.264.
[Brief description of the drawings]
[0022]
FIG. 1 is a block diagram showing a configuration of a video encoding apparatus according to Embodiment 1.
FIG. 2 is a diagram for explaining position information of a block that requires a motion amount or a motion vector.
FIG. 3 is a flowchart for explaining the additional data processing procedure of the video encoding apparatus according to the first embodiment.
FIG. 4 is a block diagram showing a configuration of a moving picture decoding apparatus according to Embodiment 1.
FIG. 5 is a block diagram illustrating a configuration of a post processing unit of the video decoding device according to the first embodiment.
FIG. 6 is a flowchart for explaining the additional data processing procedure of the video encoding apparatus according to the second embodiment.
FIG. 7 is a block diagram showing a configuration of a video encoding apparatus according to Embodiment 3.
FIG. 8 is a flowchart for explaining the additional data processing procedure of the video encoding apparatus according to the third embodiment.
Explanation of symbols
[0023]
101 ... orthogonal transform unit, 102 ... quantization unit, 103 ... variable length coding unit, 104 ... rate control unit, 105 ... inverse quantization unit, 106 ... inverse orthogonal transform unit, 107 ... addition unit, 108 ... in-loop filter, 109 ... frame memory, 110 ... intra prediction unit, 111 ... motion detection unit, 112 ... motion compensation unit, 113 ... prediction mode determination unit, 114 ... subtraction unit, 115 ... pixel number calculation unit (motion amount calculation unit), 116 ... comparison unit, 117 ... additional data encoding unit, 118 ... encoded data reconstruction unit,
401 ... input buffer, 402 ... variable length decoding unit, 403 ... inverse quantization unit, 404 ... inverse orthogonal transform unit, 405 ... addition unit, 406 ... in-loop filter, 407 ... frame memory, 408 ... intra prediction unit, 409 ... Motion compensation unit, 410 ... changeover switch, 411 ... additional data decoding unit, 412 ... post-processing unit, 501 ... synthesis unit, 502 ... motion compensation unit, 503 ... position information determination unit.
BEST MODE FOR CARRYING OUT THE INVENTION
[0024]
Hereinafter, preferred embodiments of the present invention will be described with reference to the drawings.
<Embodiment 1>
FIG. 1 is a block diagram illustrating a configuration of a video encoding apparatus according to the first embodiment. In FIG. 1, the moving image coding apparatus includes an orthogonal transform unit 101, a quantization unit 102, a variable length coding unit 103, a rate control unit 104, an inverse quantization unit 105, an inverse orthogonal transform unit 106, an addition unit 107, an in-loop filter 108, a frame memory 109, an intra prediction unit 110, a motion detection unit 111, a motion compensation unit 112, a prediction mode determination unit 113, a subtraction unit 114, a pixel number calculation unit (motion amount calculation unit) 115, a comparison unit 116, an additional data encoding unit 117, and an encoded data reconstruction unit 118.
[0025]
Among the above components, the orthogonal transform unit 101, the quantization unit 102, the variable length coding unit 103, the rate control unit 104, the inverse quantization unit 105, the inverse orthogonal transform unit 106, the addition unit 107, the in-loop filter 108, the frame memory 109, the intra prediction unit 110, the motion detection unit 111, the motion compensation unit 112, the prediction mode determination unit 113, and the subtraction unit 114 are well-known components of an H.264-compliant moving image encoding apparatus and are therefore described only briefly.
[0026]
In the H.264-compliant encoding method, each frame constituting a moving image stream is divided into macroblocks. For each macroblock, a prediction mode (the intra prediction mode, the inter prediction mode, or another mode) is determined and encoding is performed in that mode.
[0027]
In the intra prediction mode, a prediction block of an encoding target block is generated, and a prediction error block calculated from the encoding target block and the prediction block is encoded by orthogonal transform, quantization, and variable length encoding.
In the inter prediction mode, the positional relationship between the encoding target block and the corresponding area on the reference frame is detected as a motion vector, and the area on the reference frame displaced from the position of the encoding target block by the motion vector is used as the prediction block. A prediction error block calculated from the encoding target block and the prediction block is then encoded by orthogonal transform, quantization, and variable length encoding.
[0028]
Hereinafter, each component will be described.
The subtraction unit 114 calculates a prediction error block between the encoding target block of the input image and the prediction block input from the prediction mode determination unit 113 and outputs the prediction error block to the orthogonal transform unit 101.
The orthogonal transform unit 101 outputs transform coefficients generated by performing orthogonal transform such as discrete cosine transform on the prediction error block input from the subtraction unit 114 to the quantization unit 102.
[0029]
Based on the quantization parameter input from the rate control unit 104, the quantization unit 102 quantizes the transform coefficients output from the orthogonal transform unit 101 at a quantization scale defined according to the quantization parameter, and outputs the quantized transform coefficients to the variable length coding unit 103 and the inverse quantization unit 105.
The rate control unit 104 monitors the amount of code generated by the variable length encoding unit 103, determines a quantization parameter for matching with the target code amount, and outputs this to the quantization unit 102.
[0030]
The variable length coding unit 103 performs variable length coding on the quantized transform coefficients input from the quantization unit 102 and stores the result in a buffer (not shown). At this time, the variable length coding unit 103 also encodes the prediction mode determined by the prediction mode determination unit 113 and, when that prediction mode is inter prediction, the motion vector input from the motion compensation unit 112, and stores them in the header data.
Further, the variable length coding unit 103 includes, for each macroblock, the quantization scale used for quantization in the quantization unit 102.
[0031]
The inverse quantization unit 105 performs inverse quantization on the quantized transform coefficient based on the quantization scale used in the quantization unit 102 and outputs the quantized transform coefficient to the inverse orthogonal transform unit 106.
The inverse orthogonal transform unit 106 performs inverse orthogonal transform corresponding to the orthogonal transform of the orthogonal transform unit 101 on the inversely quantized transform coefficient input from the inverse quantization unit 105, and outputs the result to the addition unit 107.
[0032]
The addition unit 107 adds the prediction block determined by the prediction mode determination unit 113 and the inversely transformed prediction error input from the inverse orthogonal transform unit 106, and outputs the result to the in-loop filter 108.
The in-loop filter 108 removes block distortion from the locally decoded image input from the addition unit 107 and writes the result into the frame memory 109 as a reference frame.
[0033]
The intra prediction unit 110 performs intra prediction on each macroblock from the locally decoded pixel values in the same image for each of a plurality of predefined intra prediction modes, determines the optimal prediction mode, and outputs that prediction mode and the corresponding prediction block to the prediction mode determination unit 113.
[0034]
The motion detection unit 111 detects a motion vector of the encoding target block of the input image and outputs the detected motion vector to the motion compensation unit 112, the variable length encoding unit 103, and the comparison unit 116.
The motion detection unit 111 searches for a portion similar to the current block in a predetermined search range of the reference frame stored in the frame memory 109, and detects a spatial movement amount from the current block as a motion vector.
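The search described above can be illustrated with a minimal full-search block matching sketch. This is not taken from the description: the sum of absolute differences (SAD) as the similarity measure, the function names, and the convention that the vector (dx, dy) points from the current block to the matching area in the reference frame are all assumptions made for illustration.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n current block at (bx, by)
    and the reference area displaced by (dx, dy). SAD is an assumed measure."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def full_search(cur, ref, bx, by, n, search):
    """Test every displacement within +/- search pixels and keep the best one."""
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = (0, 0), None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates that fall outside the reference frame.
            inside_x = 0 <= bx + dx and bx + dx + n <= width
            inside_y = 0 <= by + dy and by + dy + n <= height
            if not (inside_x and inside_y):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost


# Hypothetical 8 x 8 frames: the current frame is the reference shifted by one pixel.
ref_frame = [[8 * y + x for x in range(8)] for y in range(8)]
cur_frame = [[ref_frame[y][min(x + 1, 7)] for x in range(8)] for y in range(8)]
mv, cost = full_search(cur_frame, ref_frame, bx=2, by=2, n=2, search=1)
```

A practical encoder would normally use faster search strategies and sub-pixel refinement; the exhaustive loop is kept here only because it is the simplest correct form of the idea.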
[0035]
The motion compensation unit 112 generates a prediction block using the motion vector input from the motion detection unit 111 and the reference frame stored in the frame memory 109, and outputs the prediction mode and the prediction block to the prediction mode determination unit 113.
[0036]
The prediction mode determination unit 113 compares the prediction block input from the intra prediction unit 110 and the prediction block input from the motion compensation unit 112 with each encoding target block, and selects the prediction block judged to have the smaller prediction error. It outputs the selected prediction block to the subtraction unit 114 and outputs the selected intra or inter prediction mode to the variable length coding unit 103.
[0037]
Next, the pixel number calculation unit 115, the comparison unit 116, the additional data encoding unit 117, and the encoded data reconstruction unit 118, which are unique configurations of the moving image encoding apparatus according to the first embodiment, will be described in detail.
[0038]
The motion amount Ds of the camera (converted to a number of pixels) can be calculated by the following equation. Here, Dr is the focal length at the time of shooting, θ is the angle of the shooting direction (lens direction) of the camera in the horizontal or vertical direction, and α is a ratio for expressing the shot image at a predetermined image size.
Ds = Dr × tan θ × α
[0039]
The camera motion amount Ds is substantially equal to the motion vector MV detected by the motion detection unit 111 when the subject is stationary and the only motion is a camera pan.
The pixel number calculation unit 115 calculates the amount of movement Ds of the camera in the horizontal direction and the vertical direction from the shooting parameters of the camera as described above, and outputs it to the comparison unit 116.
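As a rough illustration of the equation Ds = Dr × tan θ × α, the following sketch converts shooting parameters into a motion amount for the horizontal and vertical directions. The function names, the use of radians for θ, and the sample values are assumptions for illustration; the description defines only the formula itself.

```python
import math


def camera_motion_pixels(dr, theta_rad, alpha):
    """Ds = Dr * tan(theta) * alpha, with theta given in radians (assumed)."""
    return dr * math.tan(theta_rad) * alpha


def camera_motion_vector(dr, theta_h_rad, theta_v_rad, alpha):
    """Apply the same formula to the horizontal and vertical angles."""
    return (camera_motion_pixels(dr, theta_h_rad, alpha),
            camera_motion_pixels(dr, theta_v_rad, alpha))


# Hypothetical parameters: a horizontal pan with no vertical movement.
ds = camera_motion_vector(dr=50.0, theta_h_rad=math.pi / 4, theta_v_rad=0.0, alpha=2.0)
```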
[0040]
The comparison unit 116 calculates the difference between the camera motion amount Ds and the motion vector MV detected by the motion detection unit 111. When the difference is equal to or less than a predetermined threshold TH_MV, the encoding target block is interpreted as moving in the same way as the camera work, and the position information of the encoding target block and the motion amount Ds are output to the additional data encoding unit 117. When the difference is larger than the predetermined threshold TH_MV, nothing is output.
[0041]
The additional data encoding unit 117 encodes the position information output from the comparison unit 116 and the camera motion amount Ds, and outputs the encoded information to the encoded data reconstruction unit 118.
Since the camera motion amount Ds is a single vector per image, it suffices to encode one motion amount for each image.
The position information of the blocks that require the motion amount Ds can be encoded, for example, as shown in FIG. 2: the position of each block is mapped to one bit of a bit plane, with 1 for a block that requires the motion amount and 0 for a block that does not, and the bit plane is run-length encoded.
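The bit-plane approach can be sketched as follows. The run format is an assumption: the bits are taken in a fixed scan order, and the output is a list of run lengths that alternate between 0s and 1s, starting with a run of 0s that may have length zero. The exact format shown in FIG. 2 is not reproduced here, and the function names are hypothetical.

```python
def run_length_encode(bits):
    """Return alternating run lengths, starting with a (possibly empty) run of 0s."""
    runs = []
    current = 0
    count = 0
    for b in bits:
        if b == current:
            count += 1
        else:
            runs.append(count)   # close the run that just ended
            current = b
            count = 1
    runs.append(count)           # close the final run
    return runs


def run_length_decode(runs):
    """Inverse of run_length_encode."""
    bits = []
    value = 0
    for n in runs:
        bits.extend([value] * n)
        value ^= 1               # runs alternate between 0 and 1
    return bits


# Hypothetical bit plane for nine blocks in scan order.
plane = [0, 0, 1, 1, 1, 0, 1, 0, 0]
runs = run_length_encode(plane)
```

Run-length coding pays off when flagged blocks cluster together, which is plausible when a large background area follows the camera motion.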
[0042]
The encoded data reconstruction unit 118 adds the additional data encoded by the additional data encoding unit 117 to the buffer in which the encoded data created by the variable length coding unit 103 is stored, and thereby reconstructs the encoded data.
There are the following methods for storing the additional data in the buffer.
[0043]
A) When storing separately from encoded data of moving images:
When multiplexed with audio, text data, or the like, the encoded data is represented as packets, each composed of a packet header and packet data. For example, when MPEG-2 Systems is used as the multiplexing method, the packet header stores an identifier (stream_id) of the data stored in the packet data, reproduction output time management information (PTS: Presentation Time Stamp), decoding time management information (DTS: Decoding Time Stamp), and the like.
The packet data stores data indicated by an identifier indicated by stream_id.
[0044]
When storing additional data, private_stream_1 (10111101 in binary) or metadata_stream (11111100 in binary) is set in the stream_id field of the packet header, and the additional data encoded by the additional data encoding unit 117 is stored as the packet data.
Furthermore, the reproduction output time and decoding time of the encoded data synchronized with the additional data are stored in the PTS and DTS of the packet header, respectively.
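Method A might be sketched as below: the additional data is wrapped in a packet whose stream_id is private_stream_1 and whose header carries the PTS and DTS of the synchronized picture. The byte layout follows the general PES packet header of MPEG-2 Systems in a simplified form (no optional fields other than PTS and DTS). The function names and sample values are hypothetical, and the sketch is not a complete multiplexer.

```python
PRIVATE_STREAM_1 = 0b10111101  # stream_id value quoted in the description


def encode_timestamp(prefix, ts):
    """Pack a 33-bit timestamp into 5 bytes: 4-bit prefix, then 3 + 15 + 15
    timestamp bits, each group followed by a marker bit set to 1."""
    ts &= (1 << 33) - 1
    return bytes([
        (prefix << 4) | (((ts >> 30) & 0x07) << 1) | 1,
        (ts >> 22) & 0xFF,
        (((ts >> 15) & 0x7F) << 1) | 1,
        (ts >> 7) & 0xFF,
        ((ts & 0x7F) << 1) | 1,
    ])


def decode_timestamp(b):
    """Inverse of encode_timestamp (the prefix and marker bits are ignored)."""
    return ((((b[0] >> 1) & 0x07) << 30) | (b[1] << 22) |
            ((b[2] >> 1) << 15) | (b[3] << 7) | (b[4] >> 1))


def build_pes_packet(stream_id, pts, dts, payload):
    """Build one packet carrying both PTS and DTS (values in 90 kHz units)."""
    header_data = encode_timestamp(0b0011, pts) + encode_timestamp(0b0001, dts)
    # 0x80: '10' marker bits with all flags cleared; 0xC0: PTS and DTS both present.
    body = bytes([0x80, 0xC0, len(header_data)]) + header_data + payload
    if len(body) > 0xFFFF:
        raise ValueError("payload too large for a 16-bit packet length")
    return b"\x00\x00\x01" + bytes([stream_id]) + len(body).to_bytes(2, "big") + body


# Hypothetical additional data synchronized with a picture at PTS 90000, DTS 87000.
packet = build_pes_packet(PRIVATE_STREAM_1, 90000, 87000, b"\x01\x02")
```

A decoder that does not recognize the stream_id can skip the packet using the length field, which is consistent with keeping the moving image stream itself decodable without the additional data.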
[0045]
B) When storing in moving image encoded data:
In H.264, redundant pictures can be stored in addition to encoded data of normal image data (main picture).
In this case, the position information and the motion amount Ds output from the comparison unit 116 are converted into the H.264 encoded data format and stored in the redundant picture.
[0046]
Next, the additional data processing procedure will be described with reference to the flowchart of FIG. 3. FIG. 3 shows an example in which one camera motion amount Ds is encoded for each image and the position information is run-length encoded.
[0047]
First, the imaging parameters (Dr, θ, α) of the image to be encoded are read from the camera (step S1), the horizontal and vertical motion amount Ds of the camera is calculated in units of pixels from the imaging parameters (step S2), and this motion amount Ds is encoded (step S3).
Steps S1 and S2 correspond to the processing of the pixel number calculation unit 115.
[0048]
Steps S4 to S8 below correspond to the processing of the comparison unit 116.
Steps S4 to S8 are repeated for each macroblock of the encoding target image.
The difference between the camera motion amount Ds and the motion vector MV of the encoding target block detected by the motion detection unit 111 is calculated. When this difference is equal to or less than a predetermined threshold TH_MV (YES in step S4), 1 is set at the position of the encoding target block (step S5).
On the other hand, when the difference is larger than the predetermined threshold TH_MV (NO in step S4), 0 is set at the position of the encoding target block (step S6).
When the processing of all the blocks is not completed (NO in step S7), the block to be encoded is advanced to another block not processed (step S8), and the process returns to step S4.
[0049]
When all the blocks have been processed (YES in step S7), the position information is run-length encoded (step S9). The additional data encoding unit 117 is configured by steps S3 and S9.
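The per-macroblock loop of steps S4 to S8 can be sketched as follows. The description does not state how the difference between Ds and MV is measured, so the sum of absolute component differences used here is an assumption, as are the function names and sample values.

```python
def mv_difference(ds, mv):
    """Assumed difference measure: sum of absolute component differences."""
    return abs(ds[0] - mv[0]) + abs(ds[1] - mv[1])


def flag_camera_like_blocks(ds, motion_vectors, th_mv):
    """Steps S4 to S8: set 1 for a block whose motion vector is within TH_MV
    of the camera motion amount Ds, and 0 otherwise."""
    flags = []
    for mv in motion_vectors:            # one motion vector per macroblock
        if mv_difference(ds, mv) <= th_mv:
            flags.append(1)              # step S5
        else:
            flags.append(0)              # step S6
    return flags


# Hypothetical values: camera pans 4 pixels to the right, threshold of 2.
flags = flag_camera_like_blocks((4, 0), [(4, 0), (5, 1), (0, 0), (3, 0)], th_mv=2)
```

The resulting flag list is the position information that step S9 then run-length encodes.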
[0050]
In the above method, it is determined whether the motion of the encoding target block is similar to the camera work based on the difference between the camera motion amount Ds and the encoding target block motion vector MV.
As an alternative determination method, the camera motion amount Ds may be input to the motion detection unit 111 so that the prediction error amount obtained when Ds is used as the motion vector is calculated separately. Whether the motion of the encoding target block is similar to the camera work can then be determined by comparing this amount with the prediction error amount obtained for the motion vector MV.
[0051]
The above method determines only whether the motion of the encoding target block is similar to the camera work. The prediction mode selected by the prediction mode determination unit 113 may also be added to the determination condition, so that only a block whose motion is similar to the camera work and for which the intra prediction mode has been selected is encoded as a block that requires the motion amount Ds on the moving image decoding apparatus side.
[0052]
Next, the moving picture decoding apparatus according to the first embodiment will be described. This moving image decoding apparatus performs decoding processing using encoded data and additional data output from the moving image encoding apparatus.
[0053]
FIG. 4 is a block diagram of the moving image decoding apparatus according to the first embodiment. In FIG. 4, the moving image decoding apparatus includes an input buffer 401, a variable length decoding unit 402, an inverse quantization unit 403, an inverse orthogonal transform unit 404, an addition unit 405, an in-loop filter 406, a frame memory 407, an intra prediction unit 408, a motion compensation unit 409, a changeover switch 410, an additional data decoding unit 411, and a post processing unit 412.
[0054]
Among the above components, the variable length decoding unit 402, the inverse quantization unit 403, the inverse orthogonal transform unit 404, the addition unit 405, the in-loop filter 406, the frame memory 407, the intra prediction unit 408, the motion compensation unit 409, and the changeover switch 410 are well-known components of an H.264-compliant moving image decoding apparatus and are therefore described only briefly. With the configuration of the moving image decoding apparatus according to the first embodiment, H.264-compliant encoded data can be decoded normally even without the additional data.
[0055]
The input encoded data is temporarily stored in the input buffer 401. The moving image encoded data is supplied to the variable length decoding unit 402 and, as necessary, the additional data is supplied to the additional data decoding unit 411.
[0056]
The variable length decoding unit 402 decodes the moving image encoded data, and separates the prediction mode, motion vector, quantization scale, and quantized transform coefficient of each macroblock.
The variable length decoding unit 402 outputs the quantization scale and the quantized transform coefficient to the inverse quantization unit 403, outputs the prediction mode to the changeover switch 410, and outputs the motion vector to the motion compensation unit 409.
[0057]
The inverse quantization unit 403 performs inverse quantization on the transform coefficient input from the variable length decoding unit 402 and outputs the transform coefficient to the inverse orthogonal transform unit 404.
The inverse orthogonal transform unit 404 performs inverse orthogonal transform on the inversely quantized transform coefficient input from the inverse quantization unit 403 and outputs the transform coefficient to the addition unit 405.
The adding unit 405 adds the prediction block output from the changeover switch 410 and the prediction error block output from the inverse orthogonal transform unit 404, and outputs the result to the intra prediction unit 408 and the in-loop filter 406.
The in-loop filter 406 filters the decoded block input from the adding unit 405 and stores it in the frame memory 407.
[0058]
The intra prediction unit 408 generates an intra prediction block from the decoded pixel values in the same image input from the addition unit 405 and outputs it to the changeover switch 410.
The motion compensation unit 409 calculates a prediction block from the decoded image stored in the frame memory 407 and the motion vector decoded by the variable length decoding unit 402 and outputs the prediction block to the changeover switch 410.
[0059]
The changeover switch 410 selects the prediction block from the intra prediction unit 408 when the prediction mode input from the variable length decoding unit 402 is the intra prediction mode, selects the prediction block from the motion compensation unit 409 when it is the inter prediction mode, and outputs the selected block to the addition unit 405.
[0060]
Hereinafter, the additional data decoding unit 411 and the post processing unit 412 which are unique configurations of the moving image decoding apparatus according to the first embodiment will be described in detail.
[0061]
The additional data decoding unit 411 decodes the additional data input from the input buffer 401 and outputs the related information (the camera motion amount Ds and the position information of the blocks that require the motion amount Ds) to the post processing unit 412.
[0062]
The post processing unit 412 performs post processing using information related to the imaging parameters of the camera. As illustrated in FIG. 5, the post processing unit 412 includes a synthesis unit 501, a motion compensation unit 502, and a position information determination unit 503. Here, the motion compensation unit 502 has the same function as the motion compensation unit 409 in FIG. 4.
[0063]
The position information determination unit 503 extracts, from the position information of the blocks that require the camera motion amount Ds, each block that needs to be processed by the motion compensation unit 502, converts it into position information in the horizontal and vertical directions, and notifies the motion compensation unit 502 of that position information.
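The conversion from per-block flags to horizontal and vertical positions might look like the following sketch. It assumes that the flags are listed in raster-scan macroblock order and that positions are expressed as the pixel coordinates of the top-left corner of each 16 × 16 macroblock; neither convention is stated in the description, and the function name is hypothetical.

```python
MB_SIZE = 16  # H.264 macroblocks cover 16 x 16 luma samples


def flagged_block_positions(flags, mbs_per_row):
    """Return (x, y) pixel positions of the blocks whose flag is 1,
    assuming the flags are in raster-scan order."""
    positions = []
    for index, flag in enumerate(flags):
        if flag == 1:
            x = (index % mbs_per_row) * MB_SIZE
            y = (index // mbs_per_row) * MB_SIZE
            positions.append((x, y))
    return positions


# Hypothetical image that is three macroblocks wide and two macroblocks high.
positions = flagged_block_positions([0, 1, 0, 0, 0, 1], mbs_per_row=3)
```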
[0064]
Using the position information of the block to be motion-compensated, the camera motion amount Ds, which serves as the motion vector, and the reference image stored in the frame memory 407, the motion compensation unit 502 generates a motion compensation block for the decoding target block and outputs it to the synthesis unit 501.
[0065]
The synthesis unit 501 combines the motion compensation block input from the motion compensation unit 502 with the corresponding decoded block in the decoded image input from the frame memory 407 (that is, from the H.264-compliant moving image decoding apparatus), and outputs the result as the reproduced image.
For a block that does not require the camera motion amount Ds, however, no motion compensation is performed, and the synthesis unit 501 outputs the decoded block decoded by the H.264-compliant moving image decoding apparatus as it is as the reproduced image.
[0066]
The decoded block and the motion compensation block may be combined by averaging the two blocks, by adaptive processing such as a weighted sum that depends on the prediction mode type and block size of the target block decoded by the H.264-compliant moving image decoding apparatus, or by replacing the decoded block with the motion compensation block.
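The three combination options can be sketched as one function. The mode names, the single scalar weight, and the rounding rule are assumptions for illustration; how a weight would be derived from the prediction mode type and block size is not specified in the description and is left to the caller.

```python
def synthesize(decoded, compensated, mode="average", weight=0.5):
    """Combine a decoded block with a motion compensation block.

    mode 'average'  : mean of the two blocks
    mode 'weighted' : (1 - weight) * decoded + weight * compensated
    mode 'replace'  : use the motion compensation block as is
    """
    if mode == "replace":
        return [row[:] for row in compensated]
    if mode == "average":
        weight = 0.5
    elif mode != "weighted":
        raise ValueError("unknown mode: " + mode)
    # int(v + 0.5) rounds half up, which is valid for non-negative sample values.
    return [[int((1 - weight) * d + weight * c + 0.5)
             for d, c in zip(d_row, c_row)]
            for d_row, c_row in zip(decoded, compensated)]


# Hypothetical 1 x 2 blocks.
decoded_block = [[100, 50]]
compensated_block = [[120, 70]]
averaged = synthesize(decoded_block, compensated_block)
weighted = synthesize(decoded_block, compensated_block, mode="weighted", weight=0.25)
replaced = synthesize(decoded_block, compensated_block, mode="replace")
```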
[0067]
Since this post processing is applied to each block of the image immediately before reproduction, it does not affect the reference images used for prediction that are stored in the frame memory 407. The moving image decoding apparatus is therefore free to choose how the processing is performed.
[0068]
As described above, by providing the camera motion amount to the moving image decoding apparatus as additional data separate from the H.264-compliant encoded data, the moving image decoding apparatus can, while maintaining compatibility with H.264, perform processing that improves the quality of the reproduced image in accordance with its own performance.
On the other hand, even when the moving picture decoding apparatus does not perform post processing or does not have post processing, the minimum image quality decoded from encoded data compliant with H.264 is guaranteed.
[0069]
In addition, since the moving image encoding apparatus sends the position information of the blocks that correspond to the camera motion amount, more accurate position information is obtained than if the moving image decoding apparatus estimated the positions of such blocks. The moving image encoding apparatus creates the optimal encoded data as a result of its cost calculation; simply by adding the code amount of one motion vector per image and at most 1 bit per block, a visually optimal image can be reproduced on the moving image decoding apparatus side.
[0070]
<Embodiment 2>
In the second embodiment, a moving picture encoding apparatus that encodes information necessary for suppressing flicker occurring in GOP units as additional data and a moving picture decoding apparatus that decodes the encoded data will be described.
Since the configuration of the moving picture encoding apparatus according to the second embodiment is the same as that in FIG. 1, only differences will be described below.
[0071]
In a key frame, an image is normally encoded using only the intra prediction mode, so no motion information is encoded. In the second embodiment, the camera motion amount for the encoding target block, or the motion vector obtained by the motion detection unit 111, is encoded as additional data.
[0072]
Next, the additional data processing procedure in the key frame will be described with reference to the flowchart of FIG.
First, the imaging parameters (Dr, θ, α) of the image to be encoded are read from the camera, and the horizontal and vertical movement amounts Ds of the camera are calculated in terms of the number of pixels from the imaging parameters (step S11). This step S11 constitutes the pixel number calculation unit 115.
[0073]
Hereinafter, the comparison unit 116 is configured by steps S12 to S20.
If the camera motion amount Ds is larger than a predetermined threshold FTH (NO in step S12), it is interpreted that the scene is fast moving and flicker does not occur, and no special processing is performed.
On the other hand, if the camera motion amount Ds is smaller than a predetermined threshold FTH (YES in step S12), steps S14 to S19 are repeated for each macroblock of the image to be encoded.
[0074]
The motion amount Ds is encoded (step S13), and the motion vector MV of the target block is obtained by the motion detection unit 111 as in the normal inter prediction mode (step S14).
The difference between the camera motion amount Ds and the motion vector MV is calculated. If this difference is equal to or less than a predetermined threshold FMV (YES in step S15), the target block is interpreted as a block where flicker is conspicuous, and 1 is set at the position of the target block (step S16).
On the other hand, when the difference is larger than the predetermined threshold value FMV (NO in step S15), 0 is set to the position of the target block (step S17).
When the processing of all the blocks is not finished (NO in step S18), the target block is advanced to another block not processed (step S19), and the process returns to step S14.
[0075]
If all blocks have been processed (YES in step S18), the position information is run-length encoded (step S20). This step S20 constitutes the additional data encoding unit 117.
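The key frame procedure of steps S12 to S19 can be sketched as follows. Several details are assumptions: the magnitude of Ds and the difference between Ds and MV are both measured as sums of absolute components, the case where the magnitude equals FTH is treated as fast motion, and the function names are hypothetical.

```python
def magnitude(v):
    """Assumed magnitude measure: sum of absolute components."""
    return abs(v[0]) + abs(v[1])


def keyframe_flicker_flags(ds, motion_vectors, fth, fmv):
    """Return per-block flicker flags for a key frame, or None when the
    camera motion is judged too fast for flicker to be noticeable."""
    if magnitude(ds) >= fth:                    # NO in step S12
        return None
    flags = []
    for mv in motion_vectors:                   # steps S14 to S19
        diff = abs(ds[0] - mv[0]) + abs(ds[1] - mv[1])
        flags.append(1 if diff <= fmv else 0)   # step S16 or S17
    return flags


# Hypothetical values: slow pan of 1 pixel, FTH = 3, FMV = 1.
slow = keyframe_flicker_flags((1, 0), [(1, 0), (4, 0), (0, 1)], fth=3, fmv=1)
fast = keyframe_flicker_flags((5, 0), [(5, 0)], fth=3, fmv=1)
```

Returning None for the fast case mirrors the flowchart branch in which no additional data is produced for the key frame.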
[0076]
As a method of determining a block where flicker is conspicuous other than the above, there is a method of comparing the motion vectors of the key frame and the past frame.
In this method, the motion vectors MVp of the P or B picture immediately before the key frame are stored for one screen, and the motion vector MV of each key frame block is compared with the motion vector MVp of the block at the same position in the past frame. When the difference is small, it is judged that there is no significant change in motion before and after the key frame, and the block is interpreted as one where flicker is conspicuous.
In this case, the moving image decoding apparatus must hold the motion vectors of the past frame immediately before the key frame, and the moving image encoding apparatus must encode the motion vector MV as additional data for each block in the key frame.
[0077]
As described above, for a key frame in which flicker is likely to occur, the positions of the relevant blocks and the motion information (the camera motion amount Ds or the motion vector MV of the encoding target block) are encoded as additional data, separately from the H.264-compliant encoded data output from the variable length coding unit 103. The information necessary for suppressing flicker can thereby be sent to the moving image decoding apparatus.
[0078]
Next, a video decoding device according to the second embodiment will be described. This moving image decoding apparatus performs decoding processing using encoded data and additional data output from the moving image encoding apparatus according to the second embodiment.
Since the configuration of the moving picture decoding apparatus according to the second embodiment is the same as that in FIGS. 4 and 5, only differences will be described below.
[0079]
The moving picture encoding apparatus according to the second embodiment encodes, as additional data separate from the H.264-compliant encoded data, the blocks in a key frame where flicker is conspicuous together with the motion vectors of those blocks or the corresponding camera motion amount.
[0080]
The post processing unit 412 receives, only for key frames, the camera motion amount Ds decoded by the additional data decoding unit 411 and the position information of the blocks where flicker is conspicuous, and operates in the same manner as in the moving image decoding apparatus of the first embodiment. However, motion compensation is not performed for blocks where flicker is inconspicuous, and the decoded blocks decoded by the H.264-compliant moving image decoding apparatus are output as they are as reproduced blocks.
[0081]
As described above, by using the shooting parameters of the camera as additional data, the moving image encoding apparatus can reuse its conventional configuration as it is for computationally heavy processing such as motion detection, and can send information for suppressing coding distortion and for improving the quality of the reproduced image to the moving image decoding apparatus without adding a new component for that processing. The moving image decoding apparatus can choose, according to its own processing capability, whether to execute the coding distortion suppression and image quality improvement processing, and to what extent.
[0082]
For example, by combining the texture of the motion compensation block with that of the decoded block in the post processing of the moving image decoding apparatus, fluctuations in image quality before and after the key frame can be suppressed and a reproduced image with less noticeable flicker can be created.
In addition, since additional data relating only to the key frame is added, it can be applied to various applications regardless of the GOP structure.
[0083]
Furthermore, even when the motion vector MV of a block in which flicker is conspicuous is used as additional data, it can be dealt with by simply changing the motion vector input to the motion compensation unit 502 for each block.
[0084]
<Embodiment 3>
The first and second embodiments described above target the case where the entire image moves due to camera work. The third embodiment addresses the case where only part of the image moves, rather than the whole image moving due to camera work, and therefore does not use camera parameters.
[0085]
FIG. 7 is a block diagram showing the configuration of the moving picture encoding apparatus according to the third embodiment. In FIG. 7, components having the same functions as those in FIG. 1 are denoted by the same reference numerals. FIG. 7 differs from FIG. 1 in that the pixel number calculation unit 115 of FIG. 1 is removed and in the operations of the prediction mode determination unit 113 and the comparison unit 116.
[0086]
The prediction mode determination unit 113 determines the prediction mode in the same manner as in the first and second embodiments described above. However, even when the intra prediction mode is selected, the inter prediction error amount may also be small. In such a case, the intra prediction mode is considered to have been selected, even though the encoding target block contains motion, because of the way the prediction error amount is calculated.
[0087]
Therefore, in addition to the determined prediction mode, the prediction mode determination unit 113 outputs to the comparison unit 116 the difference value Dmode between the prediction error amount Dintra calculated by the intra prediction unit 110 and the prediction error amount Dinter calculated by the motion detection unit 111 (or the motion compensation unit 112).
Dmode = |Dintra - Dinter|
[0088]
The comparison unit 116 outputs to the additional data encoding unit 117 the position of each block whose difference value Dmode input from the prediction mode determination unit 113 is equal to or less than a predetermined threshold THMODE, together with the motion vector MV detected by the motion detection unit 111 for that block.
The additional data encoding unit 117 encodes the motion vector MV input from the comparison unit 116 and the position information of the block by the same method as in the first embodiment.
[0089]
Next, the additional data processing procedure in the third embodiment will be described with reference to the flowchart of FIG.
First, the prediction error amount Dintra calculated by the intra prediction unit 110 and the prediction error amount Dinter calculated by the motion detection unit 111 (or the motion compensation unit 112) are input (step S31), the difference value Dmode between the prediction error amount Dintra and the prediction error amount Dinter is calculated (step S32), and the prediction mode is determined in the same manner as in the first and second embodiments (step S33).
The prediction mode determination unit 113 is configured by steps S31 to S33.
[0090]
Steps S34 to S39 are repeated for each macroblock of the image to be encoded.
When the prediction mode of the target block is other than the intra prediction mode (NO in step S34), the process proceeds to step S37.
On the other hand, when the prediction mode of the target block is the intra prediction mode (YES in step S34) and the difference value Dmode is equal to or smaller than a predetermined threshold value THMODE (YES in step S35), 1 is set to the position of the target block. Then, the motion vector MV is stored in association with the position of the target block (step S36).
On the other hand, when the difference value Dmode is larger than the predetermined threshold value THMODE (NO in step S35), 0 is set to the position of the target block (step S37).
When the processing of all the blocks is not completed (NO in step S38), the target block is advanced to another unprocessed block (step S39), and the process returns to step S34.
As described above, the comparison unit 116 is configured by steps S34 to S39.
[0091]
When all the blocks have been processed (YES in step S38), the position information is run-length encoded (step S40), and the motion vector MV is encoded for the block having the position information of 1 (steps S41 to S44).
The additional data encoding unit 117 is configured by steps S40 to S43.
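The block selection of steps S34 to S37 can be sketched as follows. The dictionary layout used to describe each macroblock, the key names, and the function name are hypothetical; only the conditions (intra prediction mode selected and Dmode = |Dintra - Dinter| not exceeding THMODE) come from the description.

```python
def select_additional_blocks(blocks, th_mode):
    """Return (flags, vectors): one flag per block, plus the motion vectors
    of the flagged blocks in the same order."""
    flags, vectors = [], []
    for blk in blocks:
        d_mode = abs(blk["d_intra"] - blk["d_inter"])
        if blk["mode"] == "intra" and d_mode <= th_mode:
            flags.append(1)              # step S36: flag the block, keep its MV
            vectors.append(blk["mv"])
        else:
            flags.append(0)              # step S37
    return flags, vectors


# Hypothetical macroblocks with precomputed prediction error amounts.
sample_blocks = [
    {"mode": "intra", "d_intra": 100, "d_inter": 104, "mv": (2, 0)},
    {"mode": "inter", "d_intra": 300, "d_inter": 120, "mv": (1, 1)},
    {"mode": "intra", "d_intra": 80, "d_inter": 200, "mv": (0, 3)},
]
flags, vectors = select_additional_blocks(sample_blocks, th_mode=10)
```

Because the motion vectors are stored only for flagged blocks, a decoder can pair each vector with its block by walking the decoded flags in the same order.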
[0092]
As described above, for a block in which the difference in prediction error amount between the intra prediction mode and the inter prediction mode is small, the motion vector that would be used if the inter prediction mode were selected is encoded and provided separately from the H.264-compliant encoded data created using the intra prediction mode. Information on local motion can thereby be provided to the moving picture decoding apparatus.
Here, the additional data sent separately from the encoded data consists only of the position information and the motion vectors, so the amount of data added is very small.
[0093]
The motion vector encoded as the additional data is originally calculated for selecting the prediction mode, so no new motion detection or motion compensation process is needed in order to add it. As a result, the moving picture decoding apparatus can use the accurate motion information created by the moving picture encoding apparatus to improve the quality of the reproduced image by post-processing, and various post-processing methods based on the motion vector and the position information can be applied.
[0094]
Next, a moving picture decoding apparatus according to the third embodiment will be described. This moving picture decoding apparatus performs decoding processing using the encoded data and the additional data output from the moving picture encoding apparatus according to the third embodiment.
[0095]
The additional data in the third embodiment consists of the position information of blocks for which the intra prediction mode was selected although the difference in prediction error between the intra prediction mode and the inter prediction mode is small, and the motion vector that would be used if the inter prediction mode were selected for those blocks.
Therefore, the configuration of the moving picture decoding apparatus according to the third embodiment is the same as that shown in FIGS. 4 and 5, except that the motion vector input to the motion compensation unit 502 shown in FIG. 5 differs for each block.
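The per-block use of the motion vectors on the decoding side can be sketched as below. This is a minimal sketch under several assumptions that are not stated in the text: integer-pixel motion vectors given as `(dx, dy)`, square blocks in raster order, a reference frame stored as a list of pixel rows, and clamping at the frame border. The function names are hypothetical, and the later synthesis with the decoded block is not shown.

```python
from typing import Dict, List, Tuple

Frame = List[List[int]]


def motion_compensate_block(
    reference: Frame, top: int, left: int, size: int, mv: Tuple[int, int]
) -> Frame:
    """Copy a size x size block from the reference frame, displaced by mv."""
    height, width = len(reference), len(reference[0])
    dx, dy = mv
    block: Frame = []
    for y in range(top + dy, top + dy + size):
        cy = min(max(y, 0), height - 1)      # clamp at the border (assumption)
        row = []
        for x in range(left + dx, left + dx + size):
            cx = min(max(x, 0), width - 1)
            row.append(reference[cy][cx])
        block.append(row)
    return block


def compensate_flagged_blocks(
    reference: Frame,
    position_info: List[int],
    mvs: List[Tuple[int, int]],
    blocks_per_row: int,
    size: int,
) -> Dict[int, Frame]:
    """Generate a motion compensation block for every block flagged with 1."""
    result: Dict[int, Frame] = {}
    mv_iter = iter(mvs)  # one motion vector per flagged block, in block order
    for index, flag in enumerate(position_info):
        if flag == 1:
            top = (index // blocks_per_row) * size
            left = (index % blocks_per_row) * size
            result[index] = motion_compensate_block(
                reference, top, left, size, next(mv_iter)
            )
    return result
```

Blocks whose position information is 0 are left untouched here, so a decoder that ignores the additional data would still produce the ordinary decoded image.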
[0096]
In the moving picture decoding apparatus, apart from the H.264-compliant encoded data, the motion vector for the case where the inter prediction mode is selected is provided as additional data for blocks for which the intra prediction mode was selected although the difference in prediction error between the intra prediction mode and the inter prediction mode is small. The quality of the reproduced image can therefore be improved in accordance with the performance of the apparatus.
[0097]
Note that the present invention is not limited to the above-described embodiment, and various modifications and corrections can be made without departing from the scope of the present invention.

Claims

A moving image encoding apparatus that divides a moving image into small blocks, selects either an intra prediction mode or an inter prediction mode as a prediction mode for each block, and outputs encoded data, the apparatus comprising: a motion amount calculation unit that calculates, based on shooting parameters, a motion amount (corresponding to a motion vector) of an image corresponding to the motion of the camera; a comparison unit that compares, for each block, the motion vector used in the inter prediction mode with the motion amount; and an additional data encoding unit that encodes the motion amount and the position information of the corresponding block according to the comparison result, and stores and adds the encoded information separately from the encoded data, wherein the additional data is processed according to the performance of a moving image decoding apparatus, so that the quality of the reproduced image can be improved while maintaining compatibility with existing moving image decoding apparatuses.

A moving image encoding apparatus that divides a moving image into small blocks, selects either an intra prediction mode or an inter prediction mode as a prediction mode for each block, and outputs encoded data, the apparatus comprising: a motion amount calculation unit that calculates, based on shooting parameters, a motion amount (corresponding to a motion vector) of an image corresponding to the motion of the camera; a comparison unit that, when the motion amount is equal to or less than a predetermined value, compares, for each block, the motion vector used in the inter prediction mode with the motion amount; and an additional data encoding unit that encodes the motion amount and the position information of the corresponding block according to the comparison result, and stores and adds the encoded information separately from the encoded data, wherein the additional data is processed according to the performance of a moving image decoding apparatus, so that the quality of the reproduced image can be improved while maintaining compatibility with existing moving image decoding apparatuses.

A moving picture decoding apparatus that decodes encoded data obtained by dividing a moving image into small blocks and encoding each block with either an intra prediction mode or an inter prediction mode selected as the prediction mode, and that acquires and decodes additional data, stored and added separately from the encoded data, in which a motion amount (corresponding to a motion vector) of an image corresponding to the motion of the camera and the position information of the corresponding block are encoded according to the result of comparing, for each block, the motion vector used in the inter prediction mode with the motion amount, the apparatus comprising: an additional data decoding unit that decodes the additional data stored and added separately from the encoded data to obtain the motion amount and the position information of the block; a motion compensation unit that generates a motion compensation block from the motion amount; and a synthesizing unit that synthesizes an image of the block obtained by decoding the encoded data corresponding to the block indicated by the position information with an image of the motion compensation block generated by the motion compensation unit, wherein the image quality of the reproduced image is improved, compared with a case where only the encoded data is decoded, by adding the decoding process for the additional data.

A moving picture decoding apparatus that decodes encoded data obtained by dividing a moving image into small blocks and encoding each block with either an intra prediction mode or an inter prediction mode selected as the prediction mode, and that acquires and decodes additional data, stored and added separately from the encoded data, in which, for a key frame, when a motion amount (corresponding to a motion vector) of an image corresponding to the motion of the camera is equal to or less than a predetermined value, the motion amount and the position information of the corresponding block are encoded according to the result of comparing, for each block, the motion vector used in the inter prediction mode with the motion amount, the apparatus comprising: an additional data decoding unit that, when decoding the key frame, decodes the additional data stored and added separately from the encoded data to obtain the motion amount and the position information of the block; a motion compensation unit that generates a motion compensation block from the motion amount; and a synthesizing unit that synthesizes an image of the block obtained by decoding the encoded data corresponding to the block indicated by the position information with an image of the motion compensation block generated by the motion compensation unit, wherein the image quality of the reproduced image is improved, compared with a case where only the encoded data is decoded, by adding the decoding process for the additional data.