JP4507418B2

JP4507418B2 - Image information conversion apparatus and image information conversion method

Info

Publication number: JP4507418B2
Application number: JP2001039099A
Authority: JP
Inventors: 数史佐藤; 邦明高橋; 輝彦鈴木; 陽一矢ケ崎
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 2001-02-15
Filing date: 2001-02-15
Publication date: 2010-07-21
Anticipated expiration: 2021-02-15
Also published as: JP2002247586A

Description

【０００１】
【発明の属する技術分野】
本発明は、画像情報変換装置及び画像情報変換方法に関し、特に、離散コサイン変換等の直交変換と動き補償によって圧縮された画像情報（ビットストリーム）を、衛星放送、ケーブルＴＶ、インターネット等のネットワークを介して受信する際、或いは、光ディスク、磁気ディスク、フラッシュメモリ等の記憶媒体上で処理する際に用いられる画像情報変換装置及び画像情報変換方法に関する。
【０００２】
【従来の技術】
近年、画像情報をディジタルデータとして取り扱う際、画像情報特有の冗長性を利用し、効率の高い情報の伝送及び蓄積を目的とした、例えば離散コサイン変換（Discrete Cosine Transform、以下、ＤＣＴと記す。）等の直交変換と動き補償により圧縮する方式に準拠した装置が、放送局などの情報配信及び一般家庭における情報受信の双方において普及しつつある。
【０００３】
特に、ＭＰＥＧ（Moving Picture Experts Group）によって標準化されているＭＰＥＧ２は、汎用画像符号化方式としてＩＳＯ／ＩＥＣ１３８１８−２に定義されており、飛び越し走査画像及び順次走査画像の双方、並びに標準解像度画像及び高精細画像を網羅している。そのためＭＰＥＧ２は、プロフェッショナル用途からコンシューマ用途まで、広範なアプリケーションに今後とも用いられるものと予想される。
【０００４】
このようなＭＰＥＧ２圧縮方式を用いることにより、例えば７２０×４８０画素をもつ標準解像度の飛び越し走査画像であれば４〜８Ｍｂｐｓの符号量（以下、ビットレートと記す。）を、１９２０×１０８８画素をもつ高解像度の飛び越し走査画像であれば１８〜２２Ｍｂｐｓのビットレートを割り当てることで、高い圧縮率と良好な画質の実現が可能である。
【０００５】
ＭＰＥＧ２は、主として放送用に適合する高画質符号化を対象としていたが、ＭＰＥＧ１よりも低いビットレート、つまり、より高い圧縮率の符号化方式には対応していなかった。ところが携帯端末の普及とともに、今後より高い圧縮率の符号化方式のニーズは高まると予想されたことからＭＰＥＧ４符号化方式の標準化が行われ、画像符号化方式に関しては、１９９８年１２月にＩＳＯ／ＩＥＣ１４４９６−２として国際標準に承認されている。
【０００６】
このように、ディジタル放送に対応するように一旦符号化されたＭＰＥＧ２画像圧縮情報（以下、ＭＰＥＧ２ビットストリームと記す。）を、例えば、携帯端末等で処理するために、さらに低いビットレートのＭＰＥＧ４画像圧縮情報（以下、ＭＰＥＧ４ビットストリームと記す。）に変換するための画像情報変換装置として、図５に示す装置がField-to-Frame Transcoding with Spatial and Temporal Downsampling (Susie J.wee,John G.Apostolopoulos,and Nick Feamster,ICIP'99)（以下「文献１」という。）に提案されている。
【０００７】
図５に示す画像情報変換装置１００において、入力端子１１０に供給された飛び越し走査のＭＰＥＧ２ビットストリームにおける各フレームのデータは、まず、ピクチャタイプ判別部１１１に入力する。
【０００８】
当該ピクチャタイプ判別部１１１では、各フレームの入力データがＩピクチャ（画像内符号化画像）及びＰピクチャ（前方予測符号化画像）に関するものか、又はＢピクチャ（両方向予測符号化画像）に関するものであるかを判別し、前者のときのみ、そのＩ及びＰピクチャに関する情報を後続のＭＰＥＧ２画像情報復号化部（Ｉ／Ｐピクチャ）１１２に出力する。
【０００９】
ＭＰＥＧ２画像情報復号化部１１２における処理は通常のＭＰＥＧ２画像情報復号化装置と同様である。ただし、Ｂピクチャに関するデータはピクチャタイプ判別部１１１において廃棄されるため、ＭＰＥＧ２画像情報復号化部１１２における機能としてはＩ／Ｐピクチャのみを復号化できればよい。ＭＰＥＧ２画像情報復号化部１１２の出力となる画素値は、間引き部１１３に入力される。
【００１０】
当該間引き部１１３は、水平方向については１／２の間引き処理を施し、垂直方向については第一フィールド若しくは第二フィールドのどちらか一方のデータのみを残し、もう一方を廃棄することにより、入力となる画像情報の１／４の大きさをもつ順次走査画像を生成する。間引き部１１３によって生成された順次走査画像は、一旦、ビデオメモリ１１４に蓄積された後に読み出され、ＭＰＥＧ４画像情報符号化部（Ｉ／Ｐ−ＶＯＰ）１１５に入力する。
【００１１】
ここで、例えば、入力となるＭＰＥＧ２ビットストリームがＮＴＳＣ（National Television System Committee）の規格に準拠したもの、つまり７２０×４８０画素、３０Ｈｚの飛び越し走査画像であった場合、上記間引き後の画枠は３６０×２４０画素ということになるが、後続のＭＰＥＧ４画像情報符号化部１１５において符号化を行う際、マクロブロック単位の処理を行うには、水平方向、垂直方向ともに、その画素数が１６の倍数である必要がある。したがって、このための画素の補填若しくは廃棄を間引き部１１３において同時に行う。すなわちこのときの間引き部１１３は、上記画素の補填若しくは廃棄として、例えば水平方向の右端若しくは左端の８ラインを廃棄して３５２×２４０画素とする。
【００１２】
上記ＭＰＥＧ４画像情報符号化部１１５では、入力した順次走査画像の信号を符号化してＭＰＥＧ４ビットストリームを生成し、そのＭＰＥＧ４画像圧縮情報が出力端子１１８から後段へ出力される。その際、入力となるＭＰＥＧ２ビットストリーム内の動きベクトル情報は、動きベクトル合成部１１６において間引き後の画像情報に対する動きベクトルにマッピングされ、また、動きベクトル検出部１１７では、動きベクトル合成部１１６において合成された動きベクトル値を元に高精度の動きベクトルを検出する。なお、ＭＰＥＧ４において、ＶＯＰ（Video Object Plane）とは、オブジェクトを囲む１つ又は複数のマクロブロックから構成される領域を表し、ＭＰＥＧ２におけるフレームに相当するものである。このＶＯＰの領域は、符号化される方式にしたがって、Ｉピクチャ、Ｐピクチャ、及びＢピクチャのうちの何れかに分類される。Ｉ−ＶＯＰ（ＩピクチャのＶＯＰ）は、動き補償を行うことなく、画像（領域）そのものが符号化（イントラ符号化）されるものである。Ｐ−ＶＯＰ（ＰピクチャのＶＯＰ）は、基本的には、自身より時間的に前に位置する画像（Ｉ又はＰ−ＶＯＰ）に基づいて、前方予測符号化される。Ｂ−ＶＯＰ（ＢピクチャのＶＯＰ）は、基本的には、自身より時間的に前と後ろに位置する２つの画像（Ｉ又はＰ−ＶＯＰ）に基づいて両方向予測符号化されるものである。
【００１３】
上述のように、文献１には、入力となるＭＰＥＧ２ビットストリームの１／２×１／２の大きさをもつ順次走査画像のＭＰＥＧ４ビットストリームを生成する装置に関する技術が述べられている。すなわち、入力となるＭＰＥＧ２ビットストリームが例えばＮＴＳＣの規格に準拠したものであった場合、出力となるＭＰＥＧ４画像圧縮情報はＳＩＦサイズ（３５２×２４０）ということになるが、上記図５の構成によれば、上記間引き部１１３における動作の変更を行うことにより、それ以外の画枠、例えば上記の例では約１／４×１／４の画枠であるＱＳＩＦ（１７６×１１２画素）サイズの画像に変換することも可能となっている。
【００１４】
また、文献１には、ＭＰＥＧ２画像情報復号化部１１２における処理として、水平方向、垂直方向それぞれについて、入力となるＭＰＥＧ２ビットストリーム内の、８次の離散コサイン変換係数すべてを用いた復号処理を行う装置について述べられているが、図５に示した装置に関してはその限りではなく、水平方向のみ、或いは水平方向、垂直方向ともに、８次の離散コサイン変換係数のうちの低域成分のみを用いた復号処理を行い、画質劣化を最小限に抑えながら、復号処理に伴う演算量とビデオメモリ容量を削減することが可能とされている。
【００１５】
ところで、図５に示した画像情報変換装置では、ＭＰＥＧ４画像情報符号化部１１５においてＰ−ＶＯＰの符号化を行う際に、各マクロブロックを、ＭＰＥＧ４に規定されるイントラ（Ｉｎｔｒａ）マクロブロックとして符号化するか、１６×１６画素のインター（Ｉｎｔｅｒ）マクロブロックとして符号化するか、或いは、８×８画素のインター４Ｖ（Ｉｎｔｅｒ４Ｖ）マクロブロックとして符号化するかの符号化モードのタイプ判定を行う必要がある。
【００１６】
ここで、モード判定の一般的な手法としては、MPEG-4 Video Verification Model（ISO/IEC JTC1/SC29/WG11 N2932、以下これを文献２とする）において定められた手法を用いることが考えられる。
【００１７】
以下、図６を参照しながら、文献２において述べられているモード判定（図５のＭＰＥＧ４画像情報符号化部１１５で行われるモード判定）の手法について述べる。
【００１８】
まず、ステップＳ１０１として、動きベクトル検出により、インター動きベクトル及びインター４Ｖ動きベクトルを求める。
【００１９】
次に、ステップＳ１０２において、インター動きベクトル及びインター４Ｖ動きベクトルにより生成される予測画をそれぞれＲｅｆ_{Ｉｎｔｅｒ}、Ｒｅｆ_{Ｉｎｔｅｒ４Ｖ}で表し、原画をＯｒｇで表すとし、さらに、ステップＳ１０３において、当該マクロブロックをインターマクロブロック、及びインター４Ｖマクロブロックとして符号化した場合の予測誤差ＥＲＲ_{Ｉｎｔｅｒ}、ＥＲＲ_{Ｉｎｔｅｒ４Ｖ}をそれぞれ式（１）及び式（２）によって算出し、また、当該マクロブロックに含まれる画素の平均値をＭｅａｎ＿ＭＢとして、当該マクロブロックをイントラマクロブロックとして符号化した場合の予測誤差ＥＲＲ_{Ｉｎｔｒａ}を、式（３）のように定義する。
ＥＲＲ_{Ｉｎｔｅｒ}＝ＳＡＤ（Ｏｒｇ−Ｒｅｆ_{Ｉｎｔｅｒ}）（１）
ＥＲＲ_{Ｉｎｔｅｒ４Ｖ}＝ＳＡＤ（Ｏｒｇ−Ｒｅｆ_{Ｉｎｔｅｒ４Ｖ}）（２）
ＥＲＲ_{Ｉｎｔｒａ}＝ＳＡＤ（Ｏｒｇ−Ｍｅａｎ＿ＭＢ）（３）
なお、式中のＳＡＤは絶対値誤差和（Sum of Absolute Difference）を表す。
【００２０】
次に、ステップＳ１０４として、上記式（１）及び式（２）で求められた予測誤差ＥＲＲ_{Ｉｎｔｅｒ}、ＥＲＲ_{Ｉｎｔｅｒ４Ｖ}から、インターマクロブロックとして符号化するのと、インター４Ｖマクロブロックとして符号化するのと、どちらの符号化効率がよいかの判定を行う。すなわち、式（４）が成立すれば、インターマクロブロックとして符号化した方が符号化効率がよいとし、成立しなければインター４Ｖマクロブロックとして符号化した方が符号化効率がよいと判定する。
ＥＲＲ_{Ｉｎｔｅｒ}−Ｏｆｆｓｅｔ＜ＥＲＲ_{Ｉｎｔｅｒ４Ｖ} （４)
なお、この式（４）において、Ｏｆｆｓｅｔはインターマクロブロックを選ばれ易くするためのパラメータで、文献２においては１２９と定められている。
【００２１】
次に、式（４）によってインターマクロブロックが選ばれた場合、パラメータＥＲＲを式（５）のように定義し、一方、インター４Ｖマクロブロックが選ばれた場合、パラメータＥＲＲを式（６）のように定義する。
ＥＲＲ＝ＥＲＲ_{Ｉｎｔｅｒ} （５）
ＥＲＲ＝ＥＲＲ_{Ｉｎｔｅｒ４Ｖ} （６）
次にマクロブロックに含まれる画素の平均値をＭｅａｎ＿ＭＢとして、当該マクロブロックをイントラマクロブロックとして符号化した場合の予測誤差ＥＲＲ_{Ｉｎｔｒａ}を式（７）のように定義する。
ＥＲＲ_{Ｉｎｔｒａ}＝ＳＡＤ（Ｏｒｇ−Ｍｅａｎ＿ＭＢ）（７）
次に、上記パラメータＥＲＲ及び前記式（３）により定義される予測誤差ＥＲＲ_{Ｉｎｔｒａ}から、当該マクロブロックをイントラマクロブロックとして符号化するのと、式（４）によって選択されたマクロブロックモードで符号化するのとではどちらが符号化効率が高いかの判定を行う。
【００２２】
すなわち式（８）が成立すれば、イントラマクロブロックとして符号化する方が効率がよいとし、成立しなければ式（４）によって選択されたマクロブロックモードで符号化する方が効率がよいとする。
ＥＲＲ_{Ｉｎｔｒａ}＜ＥＲＲ（８）
つまり、図６のステップＳ１０５において、ＥＲＲ_{Ｉｎｔｅｒ}＜ＥＲＲ_{Ｉｎｔｒａ}が成立しないときにはステップＳ１０８としてイントラマクロブロックモードで符号化する方が効率がよいとし、成立したときにはステップＳ１０７としてインターマクロブロックモードで符号化する方が効率がよいとする。また、ステップＳ１０６において、ＥＲＲ_{Ｉｎｔｅｒ４Ｖ}＜ＥＲＲ_{Ｉｎｔｒａ}が成立しないときにはステップＳ１０８としてイントラマクロブロックモードで符号化する方が効率がよいとし、成立したときにはステップＳ１０９としてインター４Ｖマクロブロックモードで符号化する方が効率がよいとする。
【００２３】
なお、前述の動きベクトル合成部１１６及び動きベクトル検出部１１７における動作原理に関しては様々な方法が考えられるが、図５に示した画像情報変換装置１００のように、入力端子１１０に供給されたＭＰＥＧ２ビットストリームの１／２×１／２の画枠をもつＭＰＥＧ４ビットストリームを出力する場合には、例えば図７に示すような流れで動きベクトルの合成及び検出を行うことが考えられる。
【００２４】
すなわち、動きベクトル合成部１１６では、まずステップＳ１１１として、入力されたＭＰＥＧ２ビットストリームから動きベクトル情報を抽出し、次に、ステップＳ１１２として、当該抽出されたＭＰＥＧ２画像圧縮情報に対する動きベクトル情報のスケーリング及び時間補正を行うことで、出力となるＭＰＥＧ４ビットストリームに対するインター４Ｖ動きベクトル情報を合成する。さらに、動きベクトル合成部１１６では、ステップＳ１１３として、上記生成されたインター４Ｖ動きベクトルの平均値若しくは代表値をインター動きベクトルとする。
【００２５】
次に、動きベクトル検出部１１７では、ステップＳ１１４として、動きベクトル合成部１１６で生成されたインター動きベクトル及びインター４Ｖ動きベクトルについて、その周辺の数画素をサーチし、それらインター動きベクトル及びインター４Ｖ動きベクトルの高精度化を行う。このようにして高精度化された動きベクトルが前記図５のＭＰＥＧ４画像情報符号化部１１５に送られる。
【００２６】
【発明が解決しようとする課題】
しかしながら、図６に示したマクロブロックモード判別法では、動き補償及び絶対値誤差和（ＳＡＤ）の算出を行うため、多くの演算量を必要とする。
【００２７】
そこで、本発明は、このような実情に鑑みてなされたものであり、マクロブロックモード判別を少ない演算量で実現可能とする、画像情報変換装置及び画像情報変換方法を提供することを目的とする。
【００２８】
【課題を解決するための手段】
本発明は、少なくとも画像内符号化画像と画像間予測符号化画像からなる第１の画像符号化情報を第２の画像符号化情報へと変換する画像情報変換装置において、上記第２の画像符号化情報を構成する複数画像からなる各符号化単位に対応する動きベクトル情報を生成する動きベクトル生成手段と、上記生成された動きベクトル情報を格納する動きベクトル格納手段と、上記第１の画像符号化情報から抽出される動きベクトル情報に基づいて生成された第１の画像間予測符号化モードの符号化単位の動きベクトルと第２の画像間予測符号化モードの符号化単位の動きベクトルとに基づいて、上記第１の画像間予測符号化モードと上記第２の画像間予測符号化モードの何れを使用するかを決定する符号化モード判定を行うモード判定手段とを有し、上記モード判定手段は、上記第１の画像間予測符号化モードの符号化単位の動きベクトルと上記第２の画像間予測符号化モードの符号化単位の動きベクトルとの間のｘ方向成分の差分絶対値和とｙ方向成分の差分絶対値和との和として算出される分散値、若しくは上記第１の画像間予測符号化モードの符号化単位の動きベクトルと上記第２の画像間予測符号化モードの符号化単位の動きベクトルとの間のｘ方向成分の差分絶対値和とｙ方向成分の差分絶対値和との最大値として算出される分散値を求め、上記分散値と予め設定した閾値との比較に基づいて、当該第２の画像符号化情報を構成する上記符号化単位についての符号化モードを判定することにより上述した課題を解決する。
【００２９】
また、本発明は、少なくとも画像内符号化画像と画像間予測符号化画像からなる第１の画像符号化情報を第２の画像符号化情報へと変換する画像情報変換方法において、上記第２の画像符号化情報を構成する複数画像からなる各符号化単位に対応する動きベクトル情報を生成する動きベクトル生成工程と、上記生成された動きベクトル情報を動きベクトル格納手段に格納する動きベクトル格納工程と、上記第１の画像符号化情報から抽出される動きベクトル情報に基づいて生成された第１の画像間予測符号化モードの符号化単位の動きベクトルと第２の画像間予測符号化モードの符号化単位の動きベクトルとに基づいて、上記第１の画像間予測符号化モードと上記第２の画像間予測符号化モードの何れを使用するかを決定する符号化モード判定を行うモード判定工程とを有し、上記モード判定工程では、上記第１の画像間予測符号化モードの符号化単位の動きベクトルと上記第２の画像間予測符号化モードの符号化単位の動きベクトルとの間のｘ方向成分の差分絶対値和とｙ方向成分の差分絶対値和との和として算出される分散値、若しくは上記第１の画像間予測符号化モードの符号化単位の動きベクトルと上記第２の画像間予測符号化モードの符号化単位の動きベクトルとの間のｘ方向成分の差分絶対値和とｙ方向成分の差分絶対値和との最大値として算出される分散値を求め、上記分散値と予め設定した閾値との比較に基づいて、当該第２の画像符号化情報を構成する上記符号化単位についての符号化モードを判定することにより、上述した課題を解決する。
【００３０】
【発明の実施の形態】
本発明の実施の形態として示す画像情報変換装置は、少なくとも画像内符号化画像（Ｉピクチャ）と画像間予測符号化画像（Ｂピクチャ）からなるＭＰＥＧ（Moving Picture Experts Group）によって標準化されたＭＰＥＧ２画像圧縮情報をＭＰＥＧ４画像圧縮情報へと変換する画像情報変換装置であって、ＭＰＥＧ２画像圧縮情報から抽出される動きベクトル情報に基づいて生成された第１の画像間予測符号化モードであるインター符号化モードの符号化単位の動きベクトルと第２の画像間予測符号化モードであるインター４Ｖ符号化モードの符号化単位の動きベクトルとに基づいて、インター符号化モードとインター４Ｖ符号化モードの何れを使用するかを決定する符号化モード判定を行うモード判定手段とを有することによって、マクロブロックモード判別をより少ない演算量で実現可能とするものである。
【００３１】
以下、図面を参照し、本発明の実施例について説明する。画像情報変換装置１は、図１に示すように、主に、ピクチャタイプ判別部１３と、圧縮情報解析部１４と、ＭＰＥＧ２画像情報復号化部（Ｉ／Ｐピクチャ）１５と、間引き部１６と、ビデオメモリ１７と、ＭＰＥＧ４画像情報符号化部（Ｉ／Ｐ−ＶＯＰ）１８と、動きベクトル合成部１９と、動きベクトル検出部２０と、Ｉｎｔｒａ／Ｉｎｔｅｒ判定部２１と、Ｉｎｔｅｒ／Ｉｎｔｅｒ４Ｖ判定部２２と、情報バッファ２３とを有して構成されている。
【００３２】
画像情報変換装置１において、入力端子１１に供給された飛び越し走査のＭＰＥＧ２画像圧縮情報（以下、ＭＰＥＧ２ビットストリームと記す。）は、ピクチャタイプ判別部１３に伝送される。
【００３３】
ピクチャタイプ判別部１３は、ピクチャタイプを判別する。すなわち、入力端子１１からのＭＰＥＧ２ビットストリームに対して、Ｉ及びＰピクチャに関する情報は出力して圧縮情報解析部１４へ送るが、Ｂピクチャに関する情報については破棄する。これによりフレームレートの変換が行われる。
【００３４】
圧縮情報解析部１４では、ピクチャタイプ判別部１３から送られてきた画像圧縮情報の構文解析を行うことにより、当該ＭＰＥＧ２ビットストリームの符号化に関連する情報を抽出し、その符号化に関連する情報を情報バッファ２３へ送り、また、当該ＭＰＥＧ２動きベクトル情報を動きベクトル合成部１９へ送り、画像圧縮情報についてはＭＰＥＧ２画像情報復号化部１５へ送る。なお、上記圧縮情報解析部１４により抽出される情報の詳細については後述する。
【００３５】
ＭＰＥＧ２画像情報復号化部１５は、図５に示した装置のものと同等である。Ｂピクチャに関する情報は前段のピクチャタイプ判別部１３において既に破棄されているため、当該ＭＰＥＧ２画像情報復号化部１５の機能としては、Ｉ及びＰピクチャに関する情報のみの復号化処理を行えるものであればよい。ＭＰＥＧ２画像情報復号化部１５の出力となる画素値は、間引き部１６に入力される。
【００３６】
間引き部１６は、水平方向について１／２の間引き処理を施し、垂直方向について第一フィールド、若しくは第二フィールドのどちらか一方のデータのみを残し、もう一方を廃棄することにより、入力となる画像情報の１／４の大きさをもつ順次走査画像を生成する。ここで例えば、入力端子１１へ供給されたＭＰＥＧ２ビットストリームがＮＴＳＣ（National Television System Committee）の規格に準拠したもの、つまり７２０×４８０画素、３０Ｈｚの飛び越し走査画像であった場合、上記間引き後の画枠は３６０×２４０画素ということになる。ただし、後続のＭＰＥＧ４画像情報符号化部１８において符号化を行う際、マクロブロック単位の処理を行うには、水平方向、垂直方向ともに、その画素数が１６の倍数である必要がある。したがって、当該間引き部１６は、上記間引きと同時に、上記画素数を１６の倍数にするための画素の補填若しくは廃棄を行う。すなわち、この例の場合の間引き部１６は、上記３６０×２４０画素に対して、例えば水平方向の右端若しくは左端の８ラインを廃棄することで、上記１６の倍数である３５２×２４０画素の画枠を構成する。当該間引き部１６によって生成された順次走査画像は、一旦、ビデオメモリ１７に蓄積された後、後段のＭＰＥＧ４画像情報符号化部１８、又は後述するＩｎｔｒａ／Ｉｎｔｅｒ判定部２１の要求に応じて読み出される。
【００３７】
また、動きベクトル合成部１９では、ＭＰＥＧ２ビットストリームから取り出されたＭＰＥＧ２動きベクトル情報を間引き後の画像情報に対する動きベクトルにマッピングし、さらに次段の動きベクトル検出部２０では、動きベクトル合成部１９において合成された動きベクトル値と、ビデオメモリ１７に記憶された画像情報とに基づいて、高精度の動きベクトルを検出する。この検出された動きベクトルは、ＭＰＥＧ４画像情報符号化部１８とＩｎｔｒａ／Ｉｎｔｅｒ判定部２１とに送られる。
【００３８】
上記ＭＰＥＧ４画像情報符号化部１８では、情報バッファ２３に保持された前記ＭＰＥＧ２ビットストリームの符号化に関連する情報と、上記動きベクトル検出部２０からの動きベクトル情報と、後述するＩｎｔｒａ／Ｉｎｔｅｒ判定部２１及びＩｎｔｅｒ／Ｉｎｔｅｒ４Ｖ判定部２２での判定処理により得られたＩｎｔｅｒ／Ｉｎｔｅｒ４Ｖのモード情報とを用い、上記ビデオメモリ１７から供給された順次走査画像の信号を符号化してＭＰＥＧ４ビットストリームを生成する。当該ＭＰＥＧ４ビットストリームは、出力端子１２から後段へ出力される。
【００３９】
情報バッファ２３、Ｉｎｔｒａ／Ｉｎｔｅｒ判定部２１、Ｉｎｔｅｒ／Ｉｎｔｅｒ４Ｖ判定部２２における動作について、以下の図２乃至図３を用いて説明する。まず、Ｉピクチャから変換されるＰ−ＶＯＰ以外の通常のＰ−ＶＯＰに対するモード判定の動作原理について、図２のフローチャート及び図３を参照しながら説明する。
【００４０】
Ｉｎｔｒａ／Ｉｎｔｅｒ判定部２１は、情報バッファ２３に格納された入力となるＭＰＥＧ２ビットストリームから上記符号化に関連する情報を抽出し（ステップＳ１）、抽出された符号化に関する情報に基づいて、最初に、Ｉｎｔｒａ／Ｉｎｔｅｒのモード判定を行う（ステップＳ２）。
【００４１】
ここで、上記飛び越し走査のＭＰＥＧ２ビットストリームの約１／２×１／２の画枠をもつ順次走査のＭＰＥＧ４ビットストリームを出力する場合において、例えば図３に示すように、入力端子１１に供給されたＭＰＥＧ２ビットストリームを構成する画像ＳＴＲ２に含まれる４つのマクロブロックＭＢ_{ＭＰＥＧ２，ｉ}（ｉ＝１，２，３，４）が、ＭＰＥＧ４ビットストリームを構成する画像ＳＴＲ４におけるマクロブロックＭＢ_{ＭＰＥＧ４，１}に対応している場合を例にあげて考えることとする。
【００４２】
例えば、図３に示すように入力となるＭＰＥＧ２ビットストリームにおける画像ＳＴＲ２の４つのマクロブロックＭＢ_{ＭＰＥＧ２，ｉ}（ｉ＝１，２，３，４）が、出力となるＭＰＥＧ４ビットストリームにおけるマクロブロックＭＢ_{ＭＰＥＧ４，１}に対応するとき、ＭＢ_{ＭＰＥＧ２，ｉ}（ｉ＝１，２，３，４）のなかでイントラマクロブロックとして符号化されているマクロブロックの個数をＮ_{Ｉｎｔｒａ}、インターマクロブロックとして符号化されているマクロブロックの個数をＮ_{Ｉｎｔｅｒ}とし、以下に示す式（９）が成立する場合、ステップＳ２において、このマクロブロックをイントラマクロブロックとして判定して符号化し、ステップＳ６において、上記出力となるＭＰＥＧ４ビットストリームを構成する画像ＳＴＲ４のマクロブロックをイントラマクロブロックに決定する。
Ｎ_{Ｉｎｔｒａ}≧Ｎ_{Ｉｎｔｅｒ} （９）
また、式（９）が成立しないとき、すなわち、イントラマクロブロックとして符号化されているマクロブロックの個数の方がインターマクロブロックとして符号化されているマクロブロックの個数よりも少ないとき、このマクロブロックをインターマクロブロック、若しくはインター４Ｖマクロブロックとして符号化すると決定しステップＳ３へ進む。
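The count-based Intra/Inter decision of equation (9) can be sketched as follows. This is only an illustrative sketch: the function name, the string labels, and the choice to ignore any macroblock type other than "intra" and "inter" (for example skipped macroblocks) are assumptions, not part of the specification.

```python
def intra_or_inter(mpeg2_mb_types):
    """Decide step S2 for one MPEG-4 macroblock.

    mpeg2_mb_types: coding types ("intra" or "inter") of the four MPEG-2
    macroblocks that map onto the MPEG-4 macroblock (hypothetical labels).
    """
    n_intra = sum(1 for t in mpeg2_mb_types if t == "intra")
    n_inter = sum(1 for t in mpeg2_mb_types if t == "inter")
    # Equation (9): N_Intra >= N_Inter selects the intra macroblock.
    if n_intra >= n_inter:
        return "intra"
    # Otherwise the Inter / Inter4V decision (step S3) follows.
    return "inter_or_inter4v"
```

A tie (two intra, two inter) falls on the intra side because equation (9) uses "≧". No pixel data is touched, which is where the saving over a SAD-based decision comes from.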
【００４３】
ここでは、前述の式（８）のように、残差を用いたモード判定を行ってもよい。或いは、マクロブロックＭＢ_{ＭＰＥＧ２，ｉ}にそれぞれ対する量子化スケールをＱ_{ＭＰＥＧ２，ｉ}（ｉ＝１，２，３，４）とし、それぞれ割り当てられた符号量（ビット数）をＢ_{ＭＰＥＧ２，ｉ}（ｉ＝１，２，３，４）としたとき、それらマクロブロックＭＢ_{ＭＰＥＧ２，ｉ}に対するコンプレキシティＸ_{ＭＰＥＧ２，ｉ}（ｉ＝１，２，３，４）を、式（１０）により計算し、このコンプレキシティを用いて符号化効率がよいと思われるモードを選択するようにしてもよい。
Ｘ_{ＭＰＥＧ２，ｉ}＝Ｑ_{ＭＰＥＧ２，ｉ}・Ｂ_{ＭＰＥＧ２，ｉ} （１０）
【００４４】
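次に、上記Ｉｎｔｒａ／Ｉｎｔｅｒ判定のステップＳ２において、インターマクロブロック若しくはインター４Ｖマクロブロックであると判定されたマクロブロックに関しては、動きベクトル検出部２０に格納された当該ＶＯＰに対するインター動きベクトル及びインター４Ｖ動きベクトルを用いてインター／インター４Ｖを判定する。すなわち、ステップＳ３において、Ｉｎｔｅｒ／Ｉｎｔｅｒ４Ｖ判定部２２は、当該マクロブロックのインター動きベクトルのｘ方向成分、ｙ方向成分をそれぞれｍｖ_{１６×１６，ｘ}，ｍｖ_{１６×１６，ｙ}とし、インター４Ｖ動きベクトルのｘ方向成分、ｙ方向成分をそれぞれｍｖ_{８×８，ｘ，ｉ}，ｍｖ_{８×８，ｙ，ｉ}（ｉ＝１，２，３，４）として、動きベクトル情報の分散値Ｄｉｓｔを下記式（１２）若しくは式（１３）により算出する。
【００４５】
【数１】
Dist = \sum_{i=1}^{4} | mv_{8\times8,x,i} - mv_{16\times16,x} | + \sum_{i=1}^{4} | mv_{8\times8,y,i} - mv_{16\times16,y} |   （１２）
【００４６】
【数２】
Dist = \max\left( \sum_{i=1}^{4} | mv_{8\times8,x,i} - mv_{16\times16,x} | ,\ \sum_{i=1}^{4} | mv_{8\times8,y,i} - mv_{16\times16,y} | \right)   （１３）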
【００４７】
続いて、ステップＳ４において、上記式（１２）若しくは式（１３）で求められた分散値Ｄｉｓｔ（Distribution）と予め設定された閾値Ｔｈとを比較し、以下の式（１４）が成立するとき、ステップＳ５において、このマクロブロックをインターマクロブロックであると決定し、成立しないとき、ステップＳ７においてインター４Ｖマクロブロックであると決定する。
Ｄｉｓｔ≦Ｔｈ（１４）
したがって、画像情報変換装置１は、上述したように、式（１）、式（２）及び式（６）で求めたような絶対値誤差和（ＳＡＤ）を計算する必要がないため、画質劣化を最小限に抑えながら演算量を大幅に削減することが可能である。
【００４８】
続いて、図４に本発明の第２の実施の形態である画像情報変換装置２を示す。画像情報変換装置２は、基本構造を図１に示した画像情報変換装置１と同様とするが、画像情報変換装置１では、動きベクトル検出部２０において高精度化されたインター動きベクトル及びインター４Ｖ動きベクトルを用いてＩｎｔｅｒ／Ｉｎｔｅｒ４Ｖの判定を行うのに対し、図４に示される画像情報変換装置２では、動きベクトル合成部３１において生成されたインター動きベクトル及びインター４Ｖ動きベクトルを用いて、Ｉｎｔｅｒ／Ｉｎｔｅｒ４Ｖの判定を行う点に特徴を有している。図４に示す画像情報変換装置２では、図１に示した画像情報変換装置１と同様の構成については、同一符号を付してある。
【００４９】
図４に示した画像情報変換装置２においては、動きベクトル合成部３１において生成されたインター動きベクトル及びインター４Ｖ動きベクトルの高精度化を動きベクトル検出部３２において行うのに先立ち、Ｉｎｔｅｒ／Ｉｎｔｅｒ４Ｖ判定部３４においてＩｎｔｅｒ／Ｉｎｔｅｒ４Ｖの判定を行うことが可能である。
【００５０】
このため、例えば、Ｉｎｔｅｒ／Ｉｎｔｅｒ４Ｖ判定部３４において、マクロブロックがＩｎｔｅｒと判定された場合には、動きベクトル検出部３２においてインター動きベクトルに対してのみ高精度化を行い、インター４Ｖ動きベクトルを高精度化する必要はない。これにより動きベクトル検出に伴う演算量の削減を実現することが可能となる。
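The saving in the second embodiment comes from deciding the mode before refinement, so that only the vectors the chosen mode needs are refined. A minimal sketch of that selection, with hypothetical function and label names:

```python
def vectors_to_refine(mode, inter_mv, inter4v_mvs):
    """Return the synthesized vectors that still need refinement.

    mode: "intra", "inter" or "inter4v" (labels are assumptions).
    inter_mv: the single 16x16 vector; inter4v_mvs: the four 8x8 vectors.
    """
    if mode == "inter":
        return [inter_mv]          # one search instead of five
    if mode == "inter4v":
        return list(inter4v_mvs)   # four searches; the 16x16 vector is skipped
    return []                      # intra macroblocks need no motion search
```

In the first embodiment all five vectors are refined before the decision, so under these assumptions an inter macroblock saves four refinement searches and an inter 4V macroblock saves one.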
【００５１】
なお、以上の説明では、入力としてＭＰＥＧ２ビットストリームを例にあげ、出力としてＭＰＥＧ４ビットストリームを対象とした例をあげたが、入力、出力ともこれに限らず、例えばＭＰＥＧ１やＨ．２６３などの画像圧縮情報（ビットストリーム）でもよい。
【００５２】
【発明の効果】
以上説明したように、本発明に係る画像情報変換装置は、第２の画像符号化情報を構成する複数画像からなる各符号化単位に対応する動きベクトル情報を生成する動きベクトル生成手段と、生成された動きベクトル情報を格納する動きベクトル格納手段と、第１の画像符号化情報から抽出される動きベクトル情報に基づいて生成された第１の画像間予測符号化モードの符号化単位の動きベクトルと第２の画像間予測符号化モードの符号化単位の動きベクトルとに基づいて、第１の画像間予測符号化モードと第２の画像間予測符号化モードの何れを使用するかを決定する符号化モード判定を行うモード判定手段とを有する。
【００５３】
したがって、本発明に係る画像情報変換装置によれば、画質劣化を最小限に抑えながら動きベクトル検出に伴う演算量の大幅な削減を実現することが可能となる。
【００５４】
また、本発明に係る画像情報変換方法は、第２の画像符号化情報を構成する複数画像からなる各符号化単位に対応する動きベクトル情報を生成する動きベクトル生成工程と、生成された動きベクトル情報を動きベクトル格納手段に格納する動きベクトル格納工程と、第１の画像符号化情報から抽出される動きベクトル情報に基づいて生成された第１の画像間予測符号化モードの符号化単位の動きベクトルと第２の画像間予測符号化モードの符号化単位の動きベクトルとに基づいて、第１の画像間予測符号化モードと第２の画像間予測符号化モードの何れを使用するかを決定する符号化モード判定を行うモード判定工程とを有する。
【００５５】
したがって、本発明に係る画像情報変換方法によれば、画質劣化を最小限に抑えながら動きベクトル検出に伴う演算量の大幅な削減を実現することが可能となる。
【図面の簡単な説明】
【図１】本発明の実施の形態として示す画像情報変換装置の構成を説明する構造図である。
【図２】本発明の実施の形態の一構成例として示す画像情報変換装置の情報バッファ、Ｉｎｔｒａ／Ｉｎｔｅｒ判定部、Ｉｎｔｅｒ／Ｉｎｔｅｒ４Ｖ判定部における動作を示すフローチャートである。
【図３】ＭＰＥＧ２ビットストリームを構成する画像ＳＴＲ２に含まれる４つのマクロブロックＭＢ_{ＭＰＥＧ２，ｉ}とＭＰＥＧ４ビットストリームを構成する画像ＳＴＲ４におけるマクロブロックＭＢ_{ＭＰＥＧ４，１}との対応を説明する説明図である。
【図４】本発明の第２の実施の形態として示す画像情報変換装置の構成を説明する構造図である。
【図５】従来の画像情報変換装置の構成を説明する構造図である。
【図６】従来の画像情報変換装置におけるＭＰＥＧ４画像情報符号化部でのモード判定の動作を説明するフローチャートである。
【図７】従来の画像情報変換装置における動きベクトル合成部及び動きベクトル検出部における動作を説明するフローチャートである。
【符号の説明】
１，２画像情報変換装置、１１入力端子、１２出力端子、１３ピクチャタイプ判別部、１４圧縮情報解析部、１５ＭＰＥＧ２画像情報復号化部、１６間引き部、１７ビデオメモリ、１８ＭＰＥＧ４画像情報符号化部、１９，３１動きベクトル合成部、２０，３２動きベクトル検出部、２１，３３Ｉｎｔｒａ／Ｉｎｔｅｒ判定部、２２，３４Ｉｎｔｅｒ／Ｉｎｔｅｒ４Ｖ判定部、２３情報バッファ
[0001]
BACKGROUND OF THE INVENTION
The present invention relates to an image information conversion apparatus and an image information conversion method, and more particularly to an apparatus and a method used when image information (a bitstream) compressed by an orthogonal transform such as the discrete cosine transform and by motion compensation is received via a network such as satellite broadcasting, cable TV, or the Internet, or is processed on a storage medium such as an optical disk, a magnetic disk, or a flash memory.
[0002]
[Prior art]
In recent years, image information has come to be handled as digital data, and devices conforming to schemes that compress it by an orthogonal transform such as the discrete cosine transform (hereinafter referred to as DCT) and by motion compensation, exploiting the redundancy peculiar to image information in order to transmit and store it efficiently, are becoming widespread both for information distribution by broadcasting stations and for information reception in ordinary households.
[0003]
In particular, MPEG2, standardized by the Moving Picture Experts Group (MPEG), is defined in ISO/IEC 13818-2 as a general-purpose image coding scheme and covers both interlaced and progressive images as well as both standard-resolution and high-definition images. MPEG2 is therefore expected to remain in use in a wide range of applications, from professional to consumer.
[0004]
With the MPEG2 compression scheme, a high compression rate and good image quality can be achieved by allocating a code amount (hereinafter referred to as a bit rate) of 4 to 8 Mbps to a standard-resolution interlaced image of 720 × 480 pixels, or a bit rate of 18 to 22 Mbps to a high-resolution interlaced image of 1920 × 1088 pixels.
[0005]
MPEG2 mainly targeted high-quality coding suitable for broadcasting and did not support bit rates lower than those of MPEG1, that is, coding at higher compression rates. With the spread of portable terminals, however, the need for such coding was expected to grow, so the MPEG4 coding scheme was standardized; its image coding part was approved as the international standard ISO/IEC 14496-2 in December 1998.
[0006]
As an image information conversion apparatus that converts MPEG2 image compression information (hereinafter referred to as an MPEG2 bitstream), once encoded for digital broadcasting, into MPEG4 image compression information of a lower bit rate (hereinafter referred to as an MPEG4 bitstream) so that it can be processed on, for example, a portable terminal, the apparatus shown in FIG. 5 has been proposed in "Field-to-Frame Transcoding with Spatial and Temporal Downsampling" (Susie J. Wee, John G. Apostolopoulos, and Nick Feamster, ICIP '99) (hereinafter referred to as "Document 1").
[0007]
In the image information conversion apparatus 100 shown in FIG. 5, the data of each frame in the interlaced MPEG2 bit stream supplied to the input terminal 110 is first input to the picture type determination unit 111.
[0008]
The picture type determination unit 111 determines whether the input data of each frame relates to an I picture (intra-coded picture) or a P picture (forward predictive-coded picture), or to a B picture (bidirectionally predictive-coded picture). Only in the former case does it output the information on the I and P pictures to the subsequent MPEG2 image information decoding unit (I/P picture) 112.
[0009]
The processing in the MPEG2 image information decoding unit 112 is the same as that of a normal MPEG2 image information decoding device. However, since the data regarding the B picture is discarded in the picture type determination unit 111, the MPEG2 image information decoding unit 112 only needs to be able to decode an I / P picture. The pixel value output from the MPEG2 image information decoding unit 112 is input to the thinning unit 113.
[0010]
The thinning unit 113 performs 1/2 decimation in the horizontal direction and, in the vertical direction, keeps only the data of either the first field or the second field and discards the other, thereby generating a progressively scanned image one quarter the size of the input image. The progressively scanned image generated by the thinning unit 113 is temporarily stored in the video memory 114, then read out and input to the MPEG4 image information encoding unit (I/P-VOP) 115.
[0011]
Here, for example, when the input MPEG2 bitstream conforms to the NTSC (National Television System Committee) standard, that is, when it is an interlaced image of 720 × 480 pixels at 30 Hz, the image frame after thinning is 360 × 240 pixels. For the subsequent MPEG4 image information encoding unit 115 to process the image in units of macroblocks, however, the number of pixels must be a multiple of 16 in both the horizontal and vertical directions. The thinning unit 113 therefore also pads or discards pixels for this purpose; for example, it discards 8 pixel columns at the right or left edge to obtain 352 × 240 pixels.
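The frame-size arithmetic above can be checked with a short sketch. It is illustrative only: the function names are hypothetical, pixels are dropped without any low-pass filtering (a real decimator would normally filter first), and only the "discard" option is shown, with the surplus columns removed at the right edge.

```python
MACROBLOCK = 16  # macroblocks cover 16 x 16 pixels

def crop_to_macroblock_multiple(size):
    """Largest multiple of 16 that does not exceed size (the discard case)."""
    return (size // MACROBLOCK) * MACROBLOCK

def output_frame_size(width, height):
    """Size after 1/2 horizontal decimation, keeping one field, and cropping."""
    return (crop_to_macroblock_multiple(width // 2),
            crop_to_macroblock_multiple(height // 2))

def decimate(frame, keep_field=0):
    """frame: list of pixel rows of an interlaced picture.

    Keeps the rows of one field, keeps every second pixel in each row,
    then crops the result to a multiple of 16 in both directions.
    """
    rows = [row[::2] for row in frame[keep_field::2]]
    out_h = crop_to_macroblock_multiple(len(rows))
    out_w = crop_to_macroblock_multiple(len(rows[0]))
    return [row[:out_w] for row in rows[:out_h]]
```

For a 720 × 480 input, `output_frame_size(720, 480)` gives 352 × 240: the 360 columns left after decimation are cut back by 8, while 240 rows are already a multiple of 16.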
[0012]
The MPEG4 image information encoding unit 115 encodes the input progressively scanned image signal to generate an MPEG4 bitstream, which is output from the output terminal 118 to the subsequent stage. In doing so, the motion vector synthesis unit 116 maps the motion vector information in the input MPEG2 bitstream to motion vectors for the thinned image, and the motion vector detection unit 117 detects high-accuracy motion vectors based on the motion vector values synthesized by the motion vector synthesis unit 116. In MPEG4, a VOP (Video Object Plane) is a region composed of one or more macroblocks surrounding an object and corresponds to a frame in MPEG2. Each VOP is classified as an I picture, a P picture, or a B picture according to how it is coded. An I-VOP (the VOP of an I picture) is an image (region) coded as is (intra-coded) without motion compensation. A P-VOP (the VOP of a P picture) is basically forward predictive-coded from an image (I- or P-VOP) that precedes it in time. A B-VOP (the VOP of a B picture) is basically bidirectionally predictive-coded from two images (I- or P-VOPs), one preceding and one following it in time.
[0013]
As described above, Document 1 describes an apparatus that generates an MPEG4 bitstream of a progressively scanned image 1/2 × 1/2 the size of the input MPEG2 bitstream. That is, when the input MPEG2 bitstream conforms to, for example, the NTSC standard, the output MPEG4 image compression information has the SIF size (352 × 240). With the configuration of FIG. 5, changing the operation of the thinning unit 113 also makes it possible to convert to other image frames, for example, in the above case, to an image of QSIF size (176 × 112 pixels), which is about 1/4 × 1/4 of the input.
[0014]
Document 1 describes, as the processing in the MPEG2 image information decoding unit 112, decoding that uses all of the eighth-order discrete cosine transform coefficients in the input MPEG2 bitstream in both the horizontal and vertical directions. The apparatus shown in FIG. 5 is not limited to this, however: decoding that uses only the low-frequency components of the eighth-order discrete cosine transform coefficients, either in the horizontal direction only or in both directions, can reduce the amount of computation and the video memory capacity required for decoding while keeping image quality degradation to a minimum.
[0015]
In the image information conversion apparatus shown in FIG. 5, when the MPEG4 image information encoding unit 115 encodes a P-VOP, it must determine the coding mode of each macroblock: whether to code it as an intra macroblock as defined in MPEG4, as an inter macroblock of 16 × 16 pixels, or as an inter 4V (Inter4V) macroblock of 8 × 8 pixels.
[0016]
Here, as a general method of mode determination, it is conceivable to use a method defined in MPEG-4 Video Verification Model (ISO / IEC JTC1 / SC29 / WG11 N2932, hereinafter referred to as Document 2).
[0017]
The method of mode determination described in Document 2 (the mode determination performed in the MPEG4 image information encoding unit 115 in FIG. 5) is explained below with reference to FIG. 6.
[0018]
First, in step S101, an inter motion vector and an inter 4V motion vector are obtained by motion vector detection.
[0019]
Next, in step S102, the predicted images generated with the inter motion vector and the inter 4V motion vectors are denoted Ref_{Inter} and Ref_{Inter4V}, respectively, and the original image is denoted Org. In step S103, the prediction errors ERR_{Inter} and ERR_{Inter4V} obtained when the macroblock is coded as an inter macroblock and as an inter 4V macroblock are calculated by equations (1) and (2), respectively. In addition, with Mean_MB denoting the mean value of the pixels in the macroblock, the prediction error ERR_{Intra} obtained when the macroblock is coded as an intra macroblock is defined by equation (3).
ERR_{Inter} = SAD(Org - Ref_{Inter})   (1)
ERR_{Inter4V} = SAD(Org - Ref_{Inter4V})   (2)
ERR_{Intra} = SAD(Org - Mean_MB)   (3)
Here, SAD denotes the sum of absolute differences.
[0020]
Next, in step S104, the prediction errors ERR_{Inter} and ERR_{Inter4V} obtained from equations (1) and (2) are used to determine whether coding as an inter macroblock or coding as an inter 4V macroblock gives the better coding efficiency. If equation (4) holds, coding as an inter macroblock is judged more efficient; otherwise, coding as an inter 4V macroblock is judged more efficient.
ERR_{Inter} - Offset < ERR_{Inter4V}   (4)
In equation (4), Offset is a parameter that makes the inter macroblock more likely to be selected; Document 2 sets it to 129.
[0021]
Next, when the inter macroblock is selected by equation (4), the parameter ERR is defined by equation (5); when the inter 4V macroblock is selected, ERR is defined by equation (6).
ERR = ERR_{Inter}   (5)
ERR = ERR_{Inter4V}   (6)
Next, with Mean_MB denoting the mean value of the pixels in the macroblock, the prediction error ERR_{Intra} obtained when the macroblock is coded as an intra macroblock is defined by equation (7).
ERR_{Intra} = SAD(Org - Mean_MB)   (7)
Next, the parameter ERR and the prediction error ERR_{Intra} defined by equation (3) are used to determine which gives the higher coding efficiency: coding the macroblock as an intra macroblock, or coding it in the macroblock mode selected by equation (4).
[0022]
That is, if equation (8) holds, coding as an intra macroblock is judged more efficient; otherwise, coding in the macroblock mode selected by equation (4) is judged more efficient.
ERR_{Intra} < ERR   (8)
In other words, in step S105 of FIG. 6, when ERR_{Inter} < ERR_{Intra} does not hold, coding in the intra macroblock mode is judged more efficient (step S108), and when it holds, coding in the inter macroblock mode is judged more efficient (step S107). Likewise, in step S106, when ERR_{Inter4V} < ERR_{Intra} does not hold, coding in the intra macroblock mode is judged more efficient (step S108), and when it holds, coding in the inter 4V macroblock mode is judged more efficient (step S109).
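The conventional decision of equations (1) to (8) can be sketched as below, which also shows why it is costly: every macroblock needs three passes over its 256 pixels, on top of the motion compensation that produces the two predicted blocks. The function names, the string labels, and the integer rounding of Mean_MB are assumptions made for illustration; ties are resolved as equation (8) is written.

```python
OFFSET = 129  # value the text attributes to Document 2

def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized pixel blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))

def intra_error(org):
    """Equation (3): SAD between the block and its own mean value."""
    pixels = [p for row in org for p in row]
    mean_mb = sum(pixels) // len(pixels)  # integer mean is an assumption
    return sum(abs(p - mean_mb) for p in pixels)

def decide_mode(org, ref_inter, ref_inter4v, offset=OFFSET):
    err_inter = sad(org, ref_inter)          # equation (1)
    err_inter4v = sad(org, ref_inter4v)      # equation (2)
    err_intra = intra_error(org)             # equation (3)
    if err_inter - offset < err_inter4v:     # equation (4)
        mode, err = "inter", err_inter       # equation (5)
    else:
        mode, err = "inter4v", err_inter4v   # equation (6)
    return "intra" if err_intra < err else mode  # equation (8)
```

Here `ref_inter` and `ref_inter4v` stand for the motion-compensated predictions of the same macroblock, which the caller is assumed to have built already.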
[0023]
Various methods are conceivable for the operating principle of the motion vector synthesis unit 116 and the motion vector detection unit 117 described above. When, as in the image information conversion apparatus 100 shown in FIG. 5, an MPEG4 bitstream whose image frame is 1/2 × 1/2 that of the MPEG2 bitstream supplied to the input terminal 110 is output, the motion vectors may, for example, be synthesized and detected according to the flow shown in FIG. 7.
[0024]
That is, the motion vector synthesis unit 116 first extracts the motion vector information from the input MPEG2 bitstream in step S111 and then, in step S112, scales and temporally corrects the extracted MPEG2 motion vector information to synthesize the inter 4V motion vector information for the output MPEG4 bitstream. Further, in step S113, the motion vector synthesis unit 116 takes the average value or a representative value of the generated inter 4V motion vectors as the inter motion vector.
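Steps S112 and S113 might look like the following sketch for the 1/2 × 1/2 case, where each of the four MPEG-2 macroblocks becomes one 8 × 8 block of the MPEG-4 macroblock. Several things here are assumptions rather than details from the text: both vector components are simply halved, results are truncated toward zero, the temporal correction is left out, and the mean (not some other representative value) is used for the inter vector.

```python
def scale_vector(mv, h_ratio=2, v_ratio=2):
    """Scale one MPEG-2 vector (x, y) to the decimated picture.

    int() truncates toward zero; the rounding rule is an arbitrary choice.
    """
    x, y = mv
    return (int(x / h_ratio), int(y / v_ratio))

def synthesize_vectors(mpeg2_vectors):
    """Four MPEG-2 macroblock vectors -> (inter 4V vectors, inter vector)."""
    inter4v = [scale_vector(mv) for mv in mpeg2_vectors]   # step S112
    n = len(inter4v)
    inter = (int(sum(x for x, _ in inter4v) / n),          # step S113:
             int(sum(y for _, y in inter4v) / n))          # mean of the four
    return inter4v, inter
```

For example, the MPEG-2 vectors (8, 4), (8, 4), (12, 4), (4, 4) give the inter 4V vectors (4, 2), (4, 2), (6, 2), (2, 2) and the inter vector (4, 2).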
[0025]
Next, in step S114, the motion vector detection unit 117 searches several pixels around each of the inter motion vector and the inter 4V motion vectors generated by the motion vector synthesis unit 116 and thereby refines them. The motion vectors refined in this way are sent to the MPEG4 image information encoding unit 115 shown in FIG. 5.
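The refinement of step S114 can be pictured as a small full search centred on the synthesized vector. The sketch below is an assumption-laden illustration: it works at integer-pixel accuracy only, uses SAD as the matching cost, takes a hypothetical `radius` parameter for "several pixels", and skips candidates that would read outside the reference picture.

```python
def block_sad(cur, ref, bx, by, mvx, mvy, size):
    """SAD between the current block and the reference block displaced by (mvx, mvy)."""
    return sum(abs(cur[by + y][bx + x] - ref[by + y + mvy][bx + x + mvx])
               for y in range(size) for x in range(size))

def refine_vector(cur, ref, bx, by, seed, size=16, radius=1):
    """Search the (2*radius+1)^2 candidates around seed; return (best_mv, best_cost)."""
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = None, None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            mvx, mvy = seed[0] + dx, seed[1] + dy
            inside = (0 <= bx + mvx and bx + mvx + size <= width and
                      0 <= by + mvy and by + mvy + size <= height)
            if not inside:
                continue  # candidate points outside the reference picture
            cost = block_sad(cur, ref, bx, by, mvx, mvy, size)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (mvx, mvy), cost
    return best_mv, best_cost
```

Because the search starts from a vector inherited from the MPEG-2 bitstream, the window can stay far smaller than in a search from scratch; with `size=8` the same routine serves the four inter 4V blocks.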
[0026]
[Problems to be solved by the invention]
However, the macroblock mode discrimination method shown in FIG. 6 requires a large amount of computation because it performs motion compensation and calculates sums of absolute differences (SAD).
[0027]
The present invention has been made in view of this situation, and its object is to provide an image information conversion apparatus and an image information conversion method that can realize macroblock mode discrimination with a small amount of computation.
[0028]
[Means for Solving the Problems]
The present invention solves the above problem by providing an image information conversion apparatus that converts first image coding information, consisting of at least intra-coded images and inter-picture predictive-coded images, into second image coding information. The apparatus has motion vector generating means that generates motion vector information corresponding to each coding unit, consisting of a plurality of images, that constitutes the second image coding information; motion vector storing means that stores the generated motion vector information; and mode determination means that performs coding mode determination, deciding which of a first inter-picture predictive coding mode and a second inter-picture predictive coding mode to use, based on the motion vector of the coding unit in the first inter-picture predictive coding mode and the motion vectors of the coding unit in the second inter-picture predictive coding mode, both generated from the motion vector information extracted from the first image coding information. The mode determination means obtains a variance value calculated either as the sum of, or as the maximum of, the sum of absolute differences of the x-direction components and the sum of absolute differences of the y-direction components between the motion vector of the coding unit in the first inter-picture predictive coding mode and the motion vectors of the coding unit in the second inter-picture predictive coding mode, and determines the coding mode for the coding unit constituting the second image coding information by comparing the variance value with a preset threshold.
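The two variance measures and the threshold comparison can be sketched in a few lines. The function names and the example threshold are hypothetical; the direction of the comparison (a variance value at or below the threshold selects the inter mode) follows the embodiment's description of the threshold test.

```python
def component_sums(inter_mv, inter4v_mvs):
    """Sums of absolute x and y differences between the 8x8 vectors and the 16x16 vector."""
    dx = sum(abs(x - inter_mv[0]) for x, _ in inter4v_mvs)
    dy = sum(abs(y - inter_mv[1]) for _, y in inter4v_mvs)
    return dx, dy

def dist_sum(inter_mv, inter4v_mvs):
    """Variance value as the sum of the two component sums."""
    dx, dy = component_sums(inter_mv, inter4v_mvs)
    return dx + dy

def dist_max(inter_mv, inter4v_mvs):
    """Variance value as the larger of the two component sums."""
    return max(component_sums(inter_mv, inter4v_mvs))

def choose_inter_mode(inter_mv, inter4v_mvs, threshold, measure=dist_sum):
    """Inter when the four 8x8 vectors stay close to the 16x16 vector, else Inter4V."""
    if measure(inter_mv, inter4v_mvs) <= threshold:
        return "inter"
    return "inter4v"
```

With the inter vector (4, 2) and the inter 4V vectors (4, 2), (5, 3), (3, 1), (4, 2), both component sums are 2, so the sum form gives 4 and the maximum form gives 2; with a threshold of 3 the first selects the inter 4V mode and the second the inter mode. The whole decision costs eight subtractions per macroblock and never reads a pixel.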
[0029]
The present invention also solves the above problem by providing an image information conversion method that converts first image coding information, consisting of at least intra-coded images and inter-picture predictive-coded images, into second image coding information. The method has a motion vector generation step of generating motion vector information corresponding to each coding unit, consisting of a plurality of images, that constitutes the second image coding information; a motion vector storage step of storing the generated motion vector information in motion vector storing means; and a mode determination step of performing coding mode determination, deciding which of a first inter-picture predictive coding mode and a second inter-picture predictive coding mode to use, based on the motion vector of the coding unit in the first inter-picture predictive coding mode and the motion vectors of the coding unit in the second inter-picture predictive coding mode, both generated from the motion vector information extracted from the first image coding information. In the mode determination step, a variance value is obtained, calculated either as the sum of, or as the maximum of, the sum of absolute differences of the x-direction components and the sum of absolute differences of the y-direction components between the motion vector of the coding unit in the first inter-picture predictive coding mode and the motion vectors of the coding unit in the second inter-picture predictive coding mode, and the coding mode for the coding unit constituting the second image coding information is determined by comparing the variance value with a preset threshold.
[0030]
DETAILED DESCRIPTION OF THE INVENTION
The image information conversion apparatus shown as an embodiment of the present invention converts MPEG2 image compression information, standardized by the Moving Picture Experts Group (MPEG) and composed of at least an intra-picture coded picture (I picture) and an inter-picture predictive coded picture (B picture), into MPEG4 image compression information. The apparatus has mode determining means for performing a coding mode determination that decides whether to use the inter coding mode, which is the first inter-picture predictive coding mode, or the inter 4V coding mode, which is the second inter-picture predictive coding mode, based on the motion vector of the coding unit in the inter coding mode and the motion vector of the coding unit in the inter 4V coding mode, both generated based on motion vector information extracted from the MPEG2 image compression information. As a result, macroblock mode discrimination can be realized with a smaller amount of calculation.
[0031]
Embodiments of the present invention will be described below with reference to the drawings. As shown in FIG. 1, the image information conversion apparatus 1 mainly includes a picture type determination unit 13, a compression information analysis unit 14, an MPEG2 image information decoding unit (I/P picture) 15, a thinning unit 16, a video memory 17, an MPEG4 image information encoding unit (I/P-VOP) 18, a motion vector synthesis unit 19, a motion vector detection unit 20, an Intra/Inter determination unit 21, an Inter/Inter4V determination unit 22, and an information buffer 23.
[0032]
In the image information conversion apparatus 1, MPEG2 image compression information (hereinafter referred to as MPEG2 bitstream) for interlace scanning supplied to the input terminal 11 is transmitted to the picture type determination unit 13.
[0033]
The picture type determination unit 13 determines the picture type. That is, for the MPEG2 bit stream from the input terminal 11, information regarding I and P pictures is output and sent to the compression information analysis unit 14, but information regarding B pictures is discarded. As a result, the frame rate is converted.
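As a rough illustration of this filtering step, the following Python sketch keeps I and P pictures and drops B pictures. The `Picture` tuple and the function name `drop_b_pictures` are hypothetical stand-ins introduced here; the patent does not describe how pictures are represented in memory.

```python
from typing import Iterable, List, Tuple

# Hypothetical container: (picture type "I"/"P"/"B", coded payload).
Picture = Tuple[str, bytes]


def drop_b_pictures(pictures: Iterable[Picture]) -> List[Picture]:
    """Pass I and P pictures through and discard B pictures."""
    return [pic for pic in pictures if pic[0] in ("I", "P")]


# Toy sequence in coding order; payloads are left empty for brevity.
stream = [("I", b""), ("B", b""), ("B", b""), ("P", b""),
          ("B", b""), ("B", b""), ("P", b"")]
kept = drop_b_pictures(stream)
kept_types = [ptype for ptype, _ in kept]
```

In this toy sequence three of the seven pictures survive, which is how discarding B pictures lowers the output frame rate.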
[0034]
The compression information analysis unit 14 performs syntax analysis of the image compression information sent from the picture type determination unit 13 and extracts the information related to the encoding of the MPEG2 bitstream. It sends the information related to the encoding to the information buffer 23, the MPEG2 motion vector information to the motion vector synthesis unit 19, and the compressed image information to the MPEG2 image information decoding unit 15. Details of the information extracted by the compression information analysis unit 14 will be described later.
[0035]
The MPEG2 image information decoding unit 15 is equivalent to that of the apparatus shown in FIG. Since the information about B pictures has already been discarded by the picture type determination unit 13 in the preceding stage, the MPEG2 image information decoding unit 15 only needs to be able to decode the information about I and P pictures. The pixel values output from the MPEG2 image information decoding unit 15 are input to the thinning unit 16.
[0036]
The thinning unit 16 performs 1/2 thinning in the horizontal direction and, in the vertical direction, keeps only the data of either the first field or the second field and discards the other, thereby generating a progressively scanned image having one quarter of the information of the input image. For example, when the MPEG2 bit stream supplied to the input terminal 11 conforms to the NTSC (National Television System Committee) standard, that is, a 720 × 480 pixel, 30 Hz interlaced scanned image, the image frame after thinning is 360 × 240 pixels. However, when encoding is performed in the subsequent MPEG4 image information encoding unit 18, the number of pixels must be a multiple of 16 in both the horizontal and vertical directions so that processing can be performed in units of macroblocks. The thinning unit 16 therefore pads or discards pixels at the same time as the thinning so that the number of pixels becomes a multiple of 16. In this example, the thinning unit 16 discards 8 pixel columns at the right end or left end of the 360 × 240 pixel image, producing an image frame of 352 × 240 pixels whose dimensions are multiples of 16. The progressively scanned image generated by the thinning unit 16 is temporarily stored in the video memory 17 and then read out in response to a request from the subsequent MPEG4 image information encoding unit 18 or from the Intra/Inter determination unit 21 described later.
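The arithmetic of this step can be checked with a short Python sketch. It is a minimal illustration under several assumptions that are not in the patent: the frame is a list of rows, the first field occupies the even rows, horizontal thinning simply drops every other sample without filtering, and excess pixels are always cropped from the right and bottom (the patent also allows padding). The name `thin_frame` is hypothetical.

```python
from typing import List

Frame = List[List[int]]  # rows of samples; a hypothetical representation


def thin_frame(frame: Frame, keep_first_field: bool = True,
               block: int = 16) -> Frame:
    """Keep one field, halve the width, crop to multiples of `block`."""
    # Assumption: even rows belong to the first field, odd rows to the second.
    start = 0 if keep_first_field else 1
    field = frame[start::2]
    # Horizontal 1/2 thinning by dropping every other sample (no filtering).
    half = [row[::2] for row in field]
    # Crop so that both dimensions are multiples of the macroblock size.
    height = len(half) - len(half) % block
    width = (len(half[0]) - len(half[0]) % block) if half else 0
    return [row[:width] for row in half[:height]]


# A 720 x 480 interlaced frame filled with zeros.
source = [[0] * 720 for _ in range(480)]
out = thin_frame(source)
out_size = (len(out[0]), len(out))  # (width, height)
```

For the 720 × 480 input, the intermediate image is 360 × 240 and the cropped result is 352 × 240, matching the figures in the text.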
[0037]
Further, the motion vector synthesis unit 19 maps the MPEG2 motion vector information extracted from the MPEG2 bit stream to motion vectors for the thinned image information. The motion vector detection unit 20 in the next stage then detects a highly accurate motion vector based on the motion vector values synthesized by the motion vector synthesis unit 19 and the image information stored in the video memory 17. The detected motion vector is sent to the MPEG4 image information encoding unit 18 and the Intra/Inter determination unit 21.
[0038]
The MPEG4 image information encoding unit 18 encodes the signal of the progressively scanned image supplied from the video memory 17 to generate an MPEG4 bit stream. The encoding is based on the information related to the encoding of the MPEG2 bitstream held in the information buffer 23, the motion vector information from the motion vector detection unit 20, and the Intra/Inter and Inter/Inter4V mode information obtained by the determination processes in the Intra/Inter determination unit 21 and the Inter/Inter4V determination unit 22, which are described later. The MPEG4 bit stream is output from the output terminal 12 to the subsequent stage.
[0039]
Operations in the information buffer 23, the Intra / Inter determination unit 21, and the Inter / Inter4V determination unit 22 will be described with reference to FIGS. First, the operation principle of mode determination for a normal P-VOP other than a P-VOP converted from an I picture will be described with reference to the flowchart of FIG. 2 and FIG.
[0040]
The Intra/Inter determination unit 21 extracts the information related to the encoding of the input MPEG2 bit stream stored in the information buffer 23 (step S1) and first performs the Intra/Inter mode determination based on the extracted information (step S2).
[0041]
Here, when an MPEG4 bit stream of progressive scanning whose image frame is about 1/2 × 1/2 that of the interlace-scanned MPEG2 bit stream is output, consider, as shown in FIG. 3, the case where four macroblocks $MB_{\mathrm{MPEG2},i}$ ($i = 1, 2, 3, 4$) included in the image STR2 constituting the MPEG2 bit stream supplied to the input terminal 11 correspond to the macroblock $MB_{\mathrm{MPEG4},1}$ in the image STR4 constituting the MPEG4 bit stream.
[0042]
For example, as shown in FIG. 3, suppose that four macroblocks $MB_{\mathrm{MPEG2},i}$ ($i = 1, 2, 3, 4$) of the image STR2 in the input MPEG2 bit stream correspond to the macroblock $MB_{\mathrm{MPEG4},1}$ in the output MPEG4 bit stream. Let $N_{\mathrm{Intra}}$ be the number of macroblocks among $MB_{\mathrm{MPEG2},i}$ ($i = 1, 2, 3, 4$) that are encoded as intra macroblocks, and $N_{\mathrm{Inter}}$ the number encoded as inter macroblocks. If the following equation (9) holds, it is judged in step S2 that the macroblock is to be encoded as an intra macroblock, and in step S6 the macroblock of the image STR4 constituting the output MPEG4 bit stream is determined to be an intra macroblock.
$$N_{\mathrm{Intra}} \geq N_{\mathrm{Inter}} \qquad (9)$$
When equation (9) does not hold, that is, when the number of macroblocks encoded as intra macroblocks is smaller than the number encoded as inter macroblocks, it is judged that this macroblock is to be encoded as an inter macroblock or an inter 4V macroblock, and the process proceeds to step S3.
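The counting rule of equation (9) can be sketched in Python as follows. The mode labels "intra" and "inter" and the function name `is_intra_by_count` are hypothetical; the patent only specifies the comparison of the two counts.

```python
from typing import Sequence


def is_intra_by_count(mpeg2_modes: Sequence[str]) -> bool:
    """Return True when equation (9), N_Intra >= N_Inter, holds.

    `mpeg2_modes` holds the coding mode of each of the four MPEG2
    macroblocks that map onto one MPEG4 macroblock.
    """
    n_intra = sum(1 for mode in mpeg2_modes if mode == "intra")
    n_inter = sum(1 for mode in mpeg2_modes if mode == "inter")
    return n_intra >= n_inter


tie = is_intra_by_count(["intra", "intra", "inter", "inter"])
mostly_inter = is_intra_by_count(["intra", "inter", "inter", "inter"])
```

Because the comparison is non-strict, a two-to-two tie is resolved in favor of intra coding; only when inter macroblocks are in the majority does the process move on to the Inter/Inter4V decision.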
[0043]
Here, mode determination using the residual may instead be performed as in the above-described equation (8). Alternatively, letting $Q_{\mathrm{MPEG2},i}$ ($i = 1, 2, 3, 4$) be the quantization scale of each macroblock $MB_{\mathrm{MPEG2},i}$ and $B_{\mathrm{MPEG2},i}$ ($i = 1, 2, 3, 4$) be the code amount (number of bits) assigned to it, the complexity $X_{\mathrm{MPEG2},i}$ ($i = 1, 2, 3, 4$) of the macroblock $MB_{\mathrm{MPEG2},i}$ may be calculated by equation (10), and the mode expected to give good coding efficiency may be selected using this complexity.
$$X_{\mathrm{MPEG2},i} = Q_{\mathrm{MPEG2},i} \cdot B_{\mathrm{MPEG2},i} \qquad (10)$$
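Equation (10) is a per-macroblock product, which the following Python sketch evaluates for four macroblocks. The numeric values are invented purely for illustration, and the function names are hypothetical. The patent does not state how the complexities are then used to pick a mode, so the sketch stops at computing them.

```python
from typing import List, Sequence


def complexity(q_scale: int, bits: int) -> int:
    """Equation (10): complexity = quantization scale x code amount."""
    return q_scale * bits


def complexities(q_scales: Sequence[int],
                 bit_counts: Sequence[int]) -> List[int]:
    """Complexity of each MPEG2 macroblock, in the same order as the inputs."""
    if len(q_scales) != len(bit_counts):
        raise ValueError("need one quantization scale per bit count")
    return [complexity(q, b) for q, b in zip(q_scales, bit_counts)]


# Illustrative (made-up) values for the four macroblocks i = 1..4.
x = complexities([8, 8, 12, 10], [120, 90, 40, 75])
```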
[0044]
Next, for a macroblock determined in the Intra/Inter determination of step S2 to be an inter macroblock or an inter 4V macroblock, the Inter/Inter4V determination is made using the inter motion vector and the inter 4V motion vectors for the VOP held in the motion vector detection unit 20. That is, in step S3, the Inter/Inter4V determination unit 22 takes the x-direction and y-direction components of the inter motion vector of the macroblock as $mv_{16\times16\_x}$ and $mv_{16\times16\_y}$, and the x-direction and y-direction components of the inter 4V motion vectors as $mv_{8\times8\_x,i}$ and $mv_{8\times8\_y,i}$ ($i = 1, 2, 3, 4$), and calculates the variance value Dist of the motion vector information by the following formula (12) or formula (13).
[0045]
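[Expression 1]
$$\mathrm{Dist} = \sum_{i=1}^{4} \left| mv_{8\times8\_x,i} - mv_{16\times16\_x} \right| + \sum_{i=1}^{4} \left| mv_{8\times8\_y,i} - mv_{16\times16\_y} \right| \qquad (12)$$
[0046]
[Expression 2]
$$\mathrm{Dist} = \max\left( \sum_{i=1}^{4} \left| mv_{8\times8\_x,i} - mv_{16\times16\_x} \right|,\ \sum_{i=1}^{4} \left| mv_{8\times8\_y,i} - mv_{16\times16\_y} \right| \right) \qquad (13)$$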
[0047]
Subsequently, in step S4, if the following expression (14) holds for the variance value Dist obtained by expression (12) or expression (13) and a preset threshold value Th, the macroblock is determined in step S5 to be an inter macroblock; if it does not hold, the macroblock is determined in step S7 to be an inter 4V macroblock.
$$\mathrm{Dist} \leq Th \qquad (14)$$
Accordingly, the image information conversion apparatus 1 does not need to calculate the sum of absolute differences (SAD) given by equations (1), (2), and (6), and can therefore greatly reduce the amount of calculation while minimizing image quality deterioration.
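The decision of steps S3 to S7 can be illustrated with a small Python sketch. It follows the wording of the summary and claims: the variance value is either the sum or the maximum of the x-direction and y-direction sums of absolute differences between the single inter (16 × 16) vector and the four inter 4V (8 × 8) vectors. The function names, the tuple representation of vectors, and the example threshold are assumptions made here for illustration.

```python
from typing import Sequence, Tuple

Vec = Tuple[int, int]  # (x, y) motion vector components


def _sads(mv16: Vec, mv8: Sequence[Vec]) -> Tuple[int, int]:
    """Per-direction sums of absolute differences against the 16x16 vector."""
    sad_x = sum(abs(v[0] - mv16[0]) for v in mv8)
    sad_y = sum(abs(v[1] - mv16[1]) for v in mv8)
    return sad_x, sad_y


def dist_sum(mv16: Vec, mv8: Sequence[Vec]) -> int:
    """Variance value as the sum of the x and y sums (formula (12))."""
    sad_x, sad_y = _sads(mv16, mv8)
    return sad_x + sad_y


def dist_max(mv16: Vec, mv8: Sequence[Vec]) -> int:
    """Variance value as the maximum of the x and y sums (formula (13))."""
    return max(_sads(mv16, mv8))


def choose_mode(dist: int, th: int) -> str:
    """Expression (14): inter when Dist <= Th, otherwise inter 4V."""
    return "inter" if dist <= th else "inter4v"


mv16 = (4, 2)
mv8 = [(4, 2), (5, 2), (3, 1), (4, 4)]
th = 4  # hypothetical threshold; the patent only says it is preset
mode_sum = choose_mode(dist_sum(mv16, mv8), th)
mode_max = choose_mode(dist_max(mv16, mv8), th)
```

For these example vectors the x-direction sum is 2 and the y-direction sum is 3, so the sum variant gives 5 and the maximum variant gives 3. With the same threshold the two variants can therefore select different modes, which suggests that the threshold would need to be tuned separately for each variant.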
[0048]
Next, FIG. 4 shows an image information conversion apparatus 2 according to the second embodiment of the present invention. The image information conversion apparatus 2 has the same basic structure as the image information conversion apparatus 1 shown in FIG. Whereas the image information conversion apparatus 1 performs the Inter/Inter4V determination using the motion vectors obtained by the motion vector detection unit 20, the image information conversion apparatus 2 shown in FIG. 4 is characterized in that it performs the Inter/Inter4V determination using the inter motion vector and the inter 4V motion vector generated by the motion vector synthesis unit 31. In the image information conversion apparatus 2 shown in FIG. 4, the same components as those of the image information conversion apparatus 1 shown in FIG.
[0049]
In the image information conversion apparatus 2 shown in FIG. 4, the Inter/Inter4V determination unit 34 can perform the Inter/Inter4V determination before the motion vector detection unit 32 increases the accuracy of the inter motion vector and the inter 4V motion vector.
[0050]
For this reason, when the Inter/Inter4V determination unit 34 determines that the macroblock is Inter, for example, the motion vector detection unit 32 only needs to increase the accuracy of the inter motion vector and does not need to increase the accuracy of the inter 4V motion vector. As a result, the amount of calculation associated with motion vector detection can be reduced.
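The saving comes from the order of operations: decide the mode from the synthesized vectors first, then refine only the vectors the chosen mode needs. The Python sketch below illustrates this with a stand-in `refine` callback that merely records its calls. The callback, the function name `decide_then_refine`, the use of the sum-type variance value, and the choice to refine only the four 8 × 8 vectors in the inter 4V case are all assumptions made for this illustration.

```python
from typing import Callable, List, Sequence, Tuple

Vec = Tuple[int, int]  # (x, y) motion vector components


def decide_then_refine(mv16: Vec, mv8: Sequence[Vec], th: int,
                       refine: Callable[[Vec], Vec]) -> Tuple[str, List[Vec]]:
    """Pick the mode from synthesized vectors, then refine only what is used."""
    sad_x = sum(abs(v[0] - mv16[0]) for v in mv8)
    sad_y = sum(abs(v[1] - mv16[1]) for v in mv8)
    dist = sad_x + sad_y  # sum-type variance value
    if dist <= th:
        # Inter: only the single 16x16 vector is refined.
        return "inter", [refine(mv16)]
    # Inter 4V (assumption): only the four 8x8 vectors are refined.
    return "inter4v", [refine(v) for v in mv8]


calls: List[Vec] = []


def fake_refine(v: Vec) -> Vec:
    """Stand-in for the motion search; records each call and returns v."""
    calls.append(v)
    return v


mode, refined = decide_then_refine(
    (4, 2), [(4, 2), (5, 2), (3, 1), (4, 4)], th=8, refine=fake_refine)
```

With the threshold set to 8 the example macroblock is classified as inter and the refinement callback runs once instead of five times, which is the reduction in motion search work that the second embodiment aims for.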
[0051]
In the above description, an MPEG2 bit stream is taken as an example of the input and an MPEG4 bit stream as an example of the output, but the input and output are not limited to these. Either may be other image compression information (bitstream) such as H.263.
[0052]
[Effect of the invention]
As described above, the image information conversion apparatus according to the present invention includes: motion vector generation means for generating motion vector information corresponding to each encoding unit consisting of a plurality of images constituting the second image encoded information; motion vector storage means for storing the generated motion vector information; and mode determination means for performing a coding mode determination that decides whether to use the first inter-picture predictive coding mode or the second inter-picture predictive coding mode, based on the motion vector of the coding unit in the first inter-picture predictive coding mode and the motion vector of the coding unit in the second inter-picture predictive coding mode, both generated based on the motion vector information extracted from the first image encoded information.
[0053]
Therefore, according to the image information conversion apparatus of the present invention, it is possible to achieve a significant reduction in the amount of calculation associated with motion vector detection while minimizing image quality degradation.
[0054]
In addition, the image information conversion method according to the present invention includes: a motion vector generation step of generating motion vector information corresponding to each encoding unit consisting of a plurality of images constituting the second image encoded information; a motion vector storage step of storing the generated motion vector information in the motion vector storage means; and a mode determination step of performing a coding mode determination that decides whether to use the first inter-picture predictive coding mode or the second inter-picture predictive coding mode, based on the motion vector of the coding unit in the first inter-picture predictive coding mode and the motion vector of the coding unit in the second inter-picture predictive coding mode, both generated based on the motion vector information extracted from the first image encoded information.
[0055]
Therefore, according to the image information conversion method according to the present invention, it is possible to achieve a significant reduction in the amount of calculation associated with motion vector detection while minimizing image quality degradation.
[Brief description of the drawings]
FIG. 1 is a structural diagram illustrating a configuration of an image information conversion apparatus shown as an embodiment of the present invention.
FIG. 2 is a flowchart illustrating operations in an information buffer, an Intra / Inter determination unit, and an Inter / Inter4V determination unit of the image information conversion apparatus shown as one configuration example of the embodiment of the present invention.
FIG. 3 is an explanatory diagram illustrating the correspondence between the four macroblocks $MB_{\mathrm{MPEG2},i}$ included in the image STR2 constituting the MPEG2 bit stream and the macroblock $MB_{\mathrm{MPEG4},1}$ in the image STR4 constituting the MPEG4 bit stream.
FIG. 4 is a structural diagram illustrating a configuration of an image information conversion apparatus shown as a second embodiment of the present invention.
FIG. 5 is a structural diagram illustrating a configuration of a conventional image information conversion apparatus.
FIG. 6 is a flowchart illustrating an operation of mode determination in an MPEG2 image information encoding unit in a conventional image information conversion apparatus.
FIG. 7 is a flowchart illustrating operations in a motion vector synthesis unit and a motion vector detection unit in a conventional image information conversion apparatus.
[Explanation of symbols]
1, 2 image information conversion apparatus, 11 input terminal, 12 output terminal, 13 picture type determination unit, 14 compression information analysis unit, 15 MPEG2 image information decoding unit, 16 thinning unit, 17 video memory, 18 MPEG4 image information encoding unit, 19, 31 motion vector synthesis unit, 20, 32 motion vector detection unit, 21, 33 Intra/Inter determination unit, 22, 34 Inter/Inter4V determination unit, 23 information buffer

Claims

An image information conversion apparatus that converts first image encoded information including at least an intra-image encoded image and an inter-image predictive encoded image into second image encoded information, the apparatus comprising:
motion vector generating means for generating motion vector information corresponding to each encoding unit consisting of a plurality of images constituting the second image encoded information;
motion vector storage means for storing the generated motion vector information; and
mode determination means for performing a coding mode determination that decides which of a first inter-picture predictive coding mode and a second inter-picture predictive coding mode is to be used, based on the motion vector of a coding unit in the first inter-picture predictive coding mode and the motion vector of a coding unit in the second inter-picture predictive coding mode, both generated based on motion vector information extracted from the first image encoded information,
wherein the mode determination means obtains a variance value between the motion vector of the coding unit in the first inter-picture predictive coding mode and the motion vector of the coding unit in the second inter-picture predictive coding mode, calculated either as the sum of, or as the maximum of, the sum of absolute differences of the x-direction components and the sum of absolute differences of the y-direction components, and determines the coding mode for the coding unit constituting the second image encoded information based on a comparison between the variance value and a preset threshold value.

The image information conversion apparatus according to claim 1, wherein the mode determination means determines the coding mode of the coding unit constituting the second image encoded information to be the first inter-picture predictive coding mode if the variance value is equal to or less than the threshold value, and to be the second inter-picture predictive coding mode if the variance value is larger than the threshold value.

An image information conversion method for converting first image encoded information including at least an intra-image encoded image and an inter-image predictive encoded image into second image encoded information, the method comprising:
a motion vector generation step of generating motion vector information corresponding to each encoding unit consisting of a plurality of images constituting the second image encoded information;
a motion vector storage step of storing the generated motion vector information in motion vector storage means; and
a mode determination step of performing a coding mode determination that decides which of a first inter-picture predictive coding mode and a second inter-picture predictive coding mode is to be used, based on the motion vector of a coding unit in the first inter-picture predictive coding mode and the motion vector of a coding unit in the second inter-picture predictive coding mode, both generated based on motion vector information extracted from the first image encoded information,
wherein, in the mode determination step, a variance value between the motion vector of the coding unit in the first inter-picture predictive coding mode and the motion vector of the coding unit in the second inter-picture predictive coding mode is obtained, calculated either as the sum of, or as the maximum of, the sum of absolute differences of the x-direction components and the sum of absolute differences of the y-direction components, and the coding mode for the coding unit constituting the second image encoded information is determined based on a comparison between the variance value and a preset threshold value.

The image information conversion method according to claim 3, wherein, in the mode determination step, the coding mode of the coding unit constituting the second image encoded information is determined to be the first inter-picture predictive coding mode if the variance value is equal to or less than the threshold value, and to be the second inter-picture predictive coding mode if the variance value is larger than the threshold value.