JP2004153573A

JP2004153573A - Device and method for detecting motion, device and method for determining adaptive orthogonal transformation mode, medium and program

Info

Publication number: JP2004153573A
Application number: JP2002316643A
Authority: JP
Inventors: Shinjiro Mizuno; 慎二郎水野; Shozo Fujii; 省造藤井; Hidemi Oka; 秀美岡
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 2002-10-30
Filing date: 2002-10-30
Publication date: 2004-05-27

Abstract

PROBLEM TO BE SOLVED: To improve encoding efficiency when encoding an interlaced image signal by selecting an appropriate orthogonal transformation mode.

SOLUTION: Macroblocks inputted from an input terminal 10 are divided into intra-frame orthogonal transformation blocks and orthogonally transformed. The same macroblocks are also divided into intra-field orthogonal transformation blocks and orthogonally transformed. In each mode, prescribed coefficient data are then selected from the orthogonally transformed coefficient data and converted to absolute values, and the maximum of these absolute values is taken as the representative value of that mode. A mode determining means 17 weights and compares the representative values of the two modes, selects the mode with the smaller value, and outputs a mode determination result.

COPYRIGHT: (C)2004, JPO

Description

【０００１】
【発明の属する技術分野】
本発明は、インタレース画像信号の動きを判定する動き検出装置、動き検出方法、適応直交変換モード判定装置、適応直交変換モード判定方法、媒体、及びプログラムに関するものである。
【０００２】
【従来の技術】
近年、デジタルビデオ機器等の画像信号のデジタル信号処理では、限られた伝送レートでの記録／再生を行うため、高能率符号化の技術開発が盛んに行われている。この高能率符号化とは、画像信号の持つ冗長度を利用して、データ量を圧縮する技術である。高能率符号化方法としては、画像フレーム内を例えば８×８画素のブロック（以下、直交変換ブロック）に分割し、直交変換ブロック単位で直交変換を行い圧縮する方法等が用いられることが多い。直交変換により算出された係数データは、所定の量子化幅で量子化され、統計的に定められた可変長符号化を施され、圧縮効率を向上して記録されることになる。
【０００３】
通常、画像信号は冗長性を持っているので、直交変換後の係数データは、低域成分に大きな値が出現し高域成分ほど小さな値となりやすい。また、人間の視覚特性上、高域成分に発生する圧縮歪みは目立ちにくいため、高域成分ほど大きな量子化幅で量子化する。よって、量子化後の係数データは高域ほどゼロとなりやすい。この特徴を利用して、可変長符号化は、量子化後の係数データを低域から高域へ所定のスキャン順に並べ替えたデータ列に対し、ゼロの続く数とその直後に現れた非ゼロの係数の値との組み合わせを、統計的に求められた所定の符号に可変長符号化する。上述したように、高域ほどゼロの続く数が多くなりやすいので、割り当てる符号が削減でき、効率の良い符号化を行うことができる。
【０００４】
しかしインタレース画像信号において、撮像された物体が動いている場合や、パニング等のようにカメラを動かして物体を撮像した場合には、画像フレームを構成する２つのフィールドの時間差があるため、フィールド間で近接する画素の相関が弱くなる。このような動きのあるブロックでは、２つのフィールドを合わせてフレーム化した状態で直交変換を行う（以下フレーム内直交変換と呼ぶ）と、大きな高域成分が出現し、符号化効率が悪くなってしまう。つまり大きな高域係数は量子化後もゼロとならずに符号化されるため、発生符号量が多くなるからである。
【０００５】
このような場合には、フィールド別にブロック化して直交変換を行う（以下フィールド内直交変換と呼ぶ）方が、大きな高域成分が出現しないので符号化効率がよい。これは、フィールド内の相関の方がフィールド間の相関よりも強いことによる。また、動きのある部分はフレーム内直交変換を用いて符号化すると、量子化歪みの発生により動きを忠実に再現できなくなる恐れがあるため、フィールド内直交変換を用いる方が良い。
【０００６】
従って、フレーム化されたブロックに対して直交変換を施すモード（フレーム内直交変換）とフィールド別に分離されたブロックに対して直交変換を施すモード（フィールド内直交変換）とを各ブロック毎に切り替えながら符号化を行う。すなわち静止部分ではフレーム内直交変換を行い、動きのある部分ではフィールド内直交変換を行うようにすることが望ましい。また直交変換モードの切り替えは、各直交変換ブロック単位に行う場合や、複数の直交変換ブロックを集めたマクロブロック単位に行う場合などがある。
【０００７】
さて、上記のような直交変換モードの切り替えを行う場合、画像フレームの静止部分と動きのある部分とを的確に検出し、適切な直交変換モードを選択するモード判定方法を用いる必要がある（例えば、特許文献１、及び特許文献２参照。）。
【０００８】
従来のモード判定方法としては、フレーム化された直交変換ブロックまたはマクロブロックに対して、上下に近接した画素間（ライン間）の差分絶対値総和を計算し、所定の閾値と比較することによりモード判定を行うものがある。差分絶対値総和が閾値より小さい場合はフレーム内直交変換を、大きい場合はフィールド内直交変換を行う。すなわち、動きのある部分では上下に位置する画素間の差分が大きくなりやすく、設定した閾値より大きくなるので、動きを検出することができるという方法である。
【０００９】
また、他の従来のモード判定方法としては、各モード別に直交変換、量子化、可変長符号化を施し、発生符号量が小さい方のモードを採用し、そのモードで発生した符号を採用する方法がある。
【００１０】
このように、インターレース画像において、フレーム化されたブロックに対して直交変換を施すモード（フレーム内直交変換）とフィールド別に分離されたブロックに対して直交変換を施すモード（フィールド内直交変換）とを各ブロック毎に切り替えながら符号化を行う際などには、インターレース画像の各ブロックの動きを検出する必要がある。
【００１１】
【特許文献１】
特開平７−８７４４８号公報
【特許文献２】
特開平１０−１３６３６７号公報
【００１２】
【発明が解決しようとする課題】
しかしながら、このようにインターレース画像の各ブロックの動きを検出する際には、以下のような課題がある。
【００１３】
第１の従来の方法では、例えば静止部分であっても複雑な絵柄であったり、ノイズ量が多い場合は、ライン間の差分が大きくなり、差分絶対値総和が大きくなりやすい。従って、所定の閾値より大きくなることが多くなり、静止部分であっても動きがあるかのような判定をしてしまう。
【００１４】
しかし、静止部分では、フレーム内直交変換を行う方が符号化効率がよい。なぜなら、画素間の相関は、フレーム内の方がフィールド分離したときよりも強いからである。このような誤検出が起こる理由は、差分絶対値総和と閾値との比較という手法が、フレーム内で直交変換を行う場合とフィールド内で直交変換を行う場合とでどちらが効率がよいかを分析するものではなく、相関が一定以上弱くなったとき必ずフィールドモードを選ぶというものであるためである。
【００１５】
また、直交変換モード判定をマクロブロックに適用する場合は、マクロブロック内の部分的な動きを検出しにくい。つまり、マクロブロック内のある直交変換ブロックのみに動きがある場合、このブロックの差分絶対値は比較的大きいが、他の直交変換ブロックにおける差分絶対値は非常に小さいので、全体の差分絶対値総和は平均化され、動きのないブロックであるかのような判定をしてしまう。
【００１６】
従って、第１の従来の方法によってインターレース画像のブロックの動きを検出する際には、絵柄によっては適切な検出ができないという課題がある。
【００１７】
第２の従来の方法では、例えばハードウェアで実現した場合、直交変換、量子化、可変長符号化のための回路を各モード別に２組用いる必要があるため、回路規模の増大を招く。
【００１８】
また回路規模を増大せずに行う場合は、回路の動作周波数が２倍以上必要であり、電力消費量が増大してしまう。ソフトウェアで実現する場合でも演算量が非常に多くなってしまう。また、発生符号量のみでモード判定を行うので、動きの要素を含んでいない。従って、マクロブロック内の部分的な動きを考慮せずフレーム内直交変換を行ってしまう可能性がある。
【００１９】
すなわち、第２の従来の方法によってインターレース画像のブロックの動きを検出する際には、回路規模の増大を招き、また、回路規模を増大させずに行う場合には、電力消費量が増大するという課題がある。
【００２０】
本発明は、上記課題を考慮し、画像の複雑さやノイズ量の影響を受けにくく、インターレース画像の動きに応じて適切な動きの検出が可能な動き検出装置、動き検出方法、適応直交変換モード判定装置、適応直交変換モード判定方法、媒体、及びプログラムを提供することを目的とするものである。
【００２１】
また、本発明は、簡単な回路構成で実現可能な動き検出装置、動き検出方法、適応直交変換モード判定装置、適応直交変換モード判定方法、媒体、及びプログラムを提供することを目的とするものである。
【００２２】
【課題を解決するための手段】
上述した課題を解決するために、第１の本発明は、インターレース画像信号を分割した複数の画素ブロックそれぞれの動きを検出する動き検出装置であって、前記画素ブロックをｎ個（ｎ≧１）のフレーム内直交変換ブロックに分割するフレーム内直交変換ブロック生成手段（図１の１１）と、
分割された前記フレーム内直交変換ブロックそれぞれに動き検出用直交変換を行う第１の直交変換手段（図１の１３）と、
前記各直交変換ブロックからｍ個（ｍ≧１）の所定の係数データを抽出し、抽出された前記係数データのうちから１個を選択してフレーム代表値ｘとする第１の代表値検出手段（図１の１５）と、
前記画素ブロックを所定のｋ個（ｋ≧１）のフィールド内直交変換ブロックに分割するフィールド内直交変換ブロック生成手段（図１の１２）と、
分割された前記フィールド内直交変換ブロックそれぞれに動き検出用直交変換を行う第２の直交変換手段（図１の１４）と、
前記各直交変換ブロックからｓ個（ｓ≧１）の所定の係数データを抽出し、抽出された前記係数データのうちから１個を選択してフィールド代表値ｙとする第２の代表値検出手段（図１の１６）と、
前記フレーム代表値ｘと前記フィールド代表値ｙとを比較することによって前記画素ブロックの動きの有無を判定するモード判定手段（図１の１７）とを備えた動き検出装置である。
【００２３】
また、第２の本発明は、前記所定の係数データは、特定の垂直高域成分を示す係数である第１の本発明の動き検出装置である。
【００２４】
また、第３の本発明は、前記フレーム代表値ｘおよび前記フィールド代表値ｙは、それぞれの直交変換における、前記抽出された所定の係数データの絶対値のうちの最大値を選択したものである第１の本発明の動き検出装置である。
【００２５】
また、第４の本発明は、前記動きの有無を判定するとは、（１）フレーム代表値ｘおよびフィールド代表値ｙを直接比較する、もしくは、（２）両方または一方の代表値を所定の関数またはテーブル値で変換した値を比較して、フレーム代表値ｘの方が小さいときに動きなしと判定し、フィールド代表値ｙの方が小さいときに動きありと判定することである第３の本発明の動き検出装置である。
【００２６】
また、第５の本発明は、前記動きの有無を判定するとは、前記フレーム代表値ｘが所定の閾値以下の時、たとえ前記フレーム代表値ｘが前記フィールド代表値ｙよりも値が大きくても動きなしと判定することである第４の本発明の動き検出装置である。
【００２７】
また、第６の本発明は、前記各直交変換は、前記抽出すべき前記所定の係数データのみを演算する第１の本発明の動き検出装置である。
【００２８】
また、第７の本発明は、インタレース画像信号を複数の画素ブロックに分割し、前記画素ブロック毎に符号化する際、前記画素ブロック毎に画像符号化用フレーム直交変換または画像符号化用フィールド直交変換を切り換える制御を行う適応直交変換モード判定装置であって、
前記画素ブロックの動きの有無を判定する動き判定手段と、
（１）前記画素ブロックが動きありと判定された場合、その画素ブロックに画像符号化用フィールド直交変換を行うように制御し、（２）前記画素ブロックが動きなしと判定された場合、その画素ブロックに画像符号化用フレーム直交変換を施すように制御する制御手段（図１の１８）とを備え、
前記動き判定手段には、第１〜６の本発明のいずれかの動き検出装置が用いられている適応直交変換モード判定装置である。
【００２９】
また、第８の本発明は、インターレース画像信号を分割した複数の画素ブロックそれぞれの動きを検出する動き検出方法であって、
前記画素ブロックをｎ個（ｎ≧１）のフレーム内直交変換ブロックに分割するフレーム内直交変換ブロック生成ステップと、
分割された前記フレーム内直交変換ブロックそれぞれに動き検出用直交変換を行う第１の直交変換ステップと、
前記各直交変換ブロックからｍ個（ｍ≧１）の所定の係数データを抽出し、抽出された前記係数データのうちから１個を選択してフレーム代表値ｘとする第１の代表値検出ステップと、
前記画素ブロックを所定のｋ個（ｋ≧１）のフィールド内直交変換ブロックに分割するフィールド内直交変換ブロック生成ステップと、
分割された前記フィールド内直交変換ブロックそれぞれに動き検出用直交変換を行う第２の直交変換ステップと、
前記各直交変換ブロックからｓ個（ｓ≧１）の所定の係数データを抽出し、抽出された前記係数データのうちから１個を選択してフィールド代表値ｙとする第２の代表値検出ステップと、
前記フレーム代表値ｘと前記フィールド代表値ｙとを比較することによって前記画素ブロックの動きの有無を判定するモード判定ステップとを備えた動き検出方法である。
【００３０】
また、第９の本発明は、インタレース画像信号を複数の画素ブロックに分割し、前記画素ブロック毎に符号化する際、前記画素ブロック毎に画像符号化用フレーム直交変換または画像符号化用フィールド直交変換を切り換える制御を行う適応直交変換モード判定方法であって、
前記画素ブロックの動きの有無を判定する動き判定ステップと、
（１）前記画素ブロックが動きありと判定された場合、その画素ブロックに画像符号化用フィールド直交変換を行うように制御し、（２）前記画素ブロックが動きなしと判定された場合、その画素ブロックに画像符号化用フレーム直交変換を施すように制御する制御ステップとを備え、
前記動き判定ステップには、第８の本発明の動き検出方法が用いられている適応直交変換モード判定方法である。
【００３１】
また、第１０の本発明は、第１の本発明の動き検出装置の、前記画素ブロックをｎ個（ｎ≧１）のフレーム内直交変換ブロックに分割するフレーム内直交変換ブロック生成手段（図１の１１）と、
分割された前記フレーム内直交変換ブロックそれぞれに動き検出用直交変換を行う第１の直交変換手段（図１の１３）と、
前記各直交変換ブロックからｍ個（ｍ≧１）の所定の係数データを抽出し、抽出された前記係数データのうちから１個を選択してフレーム代表値ｘとする第１の代表値検出手段（図１の１５）と、
前記画素ブロックを所定のｋ個（ｋ≧１）のフィールド内直交変換ブロックに分割するフィールド内直交変換ブロック生成手段（図１の１２）と、
分割された前記フィールド内直交変換ブロックそれぞれに動き検出用直交変換を行う第２の直交変換手段（図１の１４）と、
前記各直交変換ブロックからｓ個（ｓ≧１）の所定の係数データを抽出し、抽出された前記係数データのうちから１個を選択してフィールド代表値ｙとする第２の代表値検出手段（図１の１６）と、
前記フレーム代表値ｘと前記フィールド代表値ｙとを比較することによって前記画素ブロックの動きの有無を判定するモード判定手段（図１の１７）としてコンピュータを機能させるためのプログラムである。
【００３２】
また、第１１の本発明は、第１０の本発明のプログラムを担持した媒体であって、コンピュータにより処理可能な媒体である。
【００３３】
本発明によれば、例えば画像の複雑さやノイズの重畳度による悪影響を受けにくく、画像の動きに応じて適切な動きの検出の判定を行うことができる。また、非常に簡単な構成で実現可能であり、回路規模や消費電力の増大を抑えることが可能な動きの検出の判定を行うことが出来る。
【００３４】
【発明の実施の形態】
以下、本発明の実施の形態を、図面を参照して説明する。
【００３５】
（実施の形態１）
以下、本発明の実施の形態１である適応直交変換モード判定方法を適用した画像符号化装置について説明する。図１は本実施形態における画像符号化装置の構成図である。
【００３６】
図１において、１０は入力端子であり、１１はフレーム内直交変換ブロック生成手段であり、１２はフィールド内直交変換ブロック生成手段であり、１３は第１の直交変換器であり、１４は第２の直交変換器であり、１５は第１の代表値検出手段であり、１６は第２の代表値検出手段であり、１７はモード判定手段であり、１８は選択器であり、１９は第３の直交変換器であり、２０は出力端子である。
【００３７】
なお、本実施の形態の第１の直交変換器１３は本発明の第１の直交変換手段の例であり、本実施の形態の第２の直交変換器１４は本発明の第２の直交変換手段の例であり、本実施の形態の選択器１８は本発明の制御手段の例である。
【００３８】
このように構成された画像符号化装置について、以下に動作を示す。
【００３９】
まず、インタレース画像信号が所定の画素ブロックに分割され、入力端子１０から入力される。
【００４０】
ここで、画素ブロックとは、フレーム内直交変換（フレーム化されたブロックに対して直交変換を施す）かフィールド内直交変換（フィールド別に分離されたブロックに対して直交変換を施す）かどちらの直交変換モードで直交変換を行うかを判定される単位であって、１個の直交変換ブロックである場合や、複数の直交変換ブロックを集めたマクロブロックである場合等がある。
【００４１】
この画素ブロックは、フレーム内直交変換ブロック生成手段１１とフィールド内直交変換ブロック生成手段１２へと入力される。
【００４２】
次に、フレーム内直交変換ブロック生成手段１１は、入力された画素ブロックをフレーム化された状態で分割し、フレーム内直交変換ブロックを生成する。また同様に、フィールド内直交変換ブロック生成手段１２は、入力された画素ブロックをフィールド分離して分割し、フィールド内直交変換ブロックを生成する。
【００４３】
図２に各直交変換ブロックのブロック化の様子を示す。まず、図２（ａ）は直交変換モードの判定を適用する画素ブロックであるマクロブロックを示している。この場合、マクロブロックは１６×１６画素で構成されていて、４個の直交変換ブロックに分割するものとしている。また、入力端子１０からは、インタレース画像信号がマクロブロックに分割されて入力されるので、マクロブロックは、２つのフィールドが合成されている。従って、図２（ａ）、（ｂ）、（ｃ）において、白の画素が偶数フィールド、斜線で塗られた画素が奇数フィールドであるとする。
【００４４】
このマクロブロックに対してフレーム内直交変換を行う場合、図２（ｂ）に示すようなフレーム内直交変換ブロック化を行い、直交変換する。すなわち、上述したフレーム構成のマクロブロックをフィールド合成された状態で直交変換単位である８×８画素の直交変換ブロックに分割する。
【００４５】
一方、フィールド内直交変換を行う場合、図２（ｃ）に示すようなフィールド内直交変換ブロック化を行い、直交変換する。すなわち、上述のマクロブロックをフィールド分離し、各フィールド毎に直交変換単位である８×８画素の直交変換ブロックに分割する。
【００４６】
次に、第１の直交変換器１３は、フレーム内直交変換ブロック生成手段１１から出力されたフレーム内直交変換ブロックを直交変換し、係数データを出力する。そして、第１の代表値検出手段１５は係数データのうち所定の係数データのみを抽出し、絶対値化する。所定の係数データは、あらかじめ決めておいた特定の周波数成分を示す係数データである。
【００４７】
図３に所定の係数データの一例を示す。図３は直交変換によって生成された係数データの模式図であって、８×８画素の直交変換ブロックに対して直交変換を行ったときに生成される６４個の係数データを示している。図３において、左上の係数データほど低域を示しており、右下の係数ほど高域を示す。また、最も左上の係数データは直流成分（ＤＣ）を示している。また図３において、Ａと記された係数データは、垂直高域成分を示す係数データであり、これを所定の係数データとして選ぶ。
【００４８】
従って、第１の代表値検出手段１５は、マクロブロック内に存在する４個のフレーム内直交変換ブロック毎にＡの位置の係数データを抽出し、それぞれ絶対値化する。そして各絶対値のうちの最大値を検出し、フレームモード代表値ｘとして出力する。
【００４９】
同様に、第２の直交変換器１４は、フィールド内直交変換ブロック生成手段１２から出力されたフィールド内直交変換ブロックを直交変換し、係数データを出力する。そして、第２の代表値検出手段１６は係数データのうち所定の係数データのみを抽出し、絶対値化する。このときの所定の係数データも、図３に示したＡの位置の係数データである。
【００５０】
従って、第２の代表値検出手段１６は、マクロブロック内に存在する４個のフィールド内直交変換ブロック毎にＡの位置の係数データを抽出し、それぞれ絶対値化する。そして、各絶対値のうちの最大値を検出し、フィールドモード代表値ｙとして出力する。
【００５１】
次にモード判定手段１７は、生成されたフレームモード代表値ｘとフィールドモード代表値ｙとを重み付け比較する。まず、簡単のために重みを１とした場合の比較を説明する。この場合、フレームモード代表値ｘとフィールドモード代表値ｙとをそのまま比較し、小さい方のモードを選択し、判定結果とする。例えば、代表値ｙが代表値ｘより小さかったら、フィールドモードを選択する。
【００５２】
この判定動作の概念図を図４に示す。図４において、ｘ−ｙ平面上にｙ＝ｘなる関数で表される境界線によってフレームモード判定エリアとフィールドモード判定エリアが分別されている。そしてモード判定手段１７は、各マクロブロックにおける代表値（ｘ，ｙ）の点をプロットし、プロットされたエリアのモードを判定結果として選ぶように動作する。
【００５３】
例えば、図４に示されているＳ点は、あるマクロブロックの代表値（ｘ，ｙ）をプロットした点である。この場合、フィールド判定エリアに属するのでフィールドモードと判定される。
【００５４】
動きのある部分を含むマクロブロックの代表値（ｘ，ｙ）の点はｘ−ｙ平面上のｘ軸付近に偏って出現する。また逆に動きのある部分を含まないマクロブロックの代表値（ｘ，ｙ）の点は、ｙ軸付近に偏って出現する。従って、動きのある部分と無い部分とを分類することができる。このようにマクロブロックの代表値（ｘ、ｙ）を比較することにより、マクロブロックの動きを検出することが出来る。
【００５５】
このようにモード判定手段１７によって判定されたモード情報は選択器１８に与えられる。また選択器１８には、このモード判定を適用されるべき上述のフレーム内直交変換ブロックデータ及びフィールド内直交変換ブロックデータが入力されている。そして、モード判定手段１７からのモード情報によりどちらか一方の直交変換ブロックデータが選択される。すなわち、モード判定手段１７は、フレーム代表値ｘとフィールド代表値ｙとを比較することによってマクロブロックの動きの有無を判定することが出来る。
【００５６】
選択器１８から出力される直交変換ブロックデータを第３の直交変換器１９が直交変換する。そして直交変換された係数データは出力端子２０から出力される。
【００５７】
さて、上述の所定の係数データの決め方の説明では、各直交変換ブロックのＡと記した１個の垂直高域係数のみを選んだが、更に選ぶ係数を増やせばもっと精度良くモード判定できる。
【００５８】
図５は所定の係数データの決め方を説明するための図である。以下、図５を参照して説明する。すなわち、図５はフレーム内直交変換ブロックの例を示すものである。
【００５９】
まず、フレーム内直交変換における係数Ａは、垂直高域成分を示す係数データであるので、物体が動くことによって、フィールド間の相関が弱くなるときに大きな値となる。つまり、図５（ａ）に記すように、フレーム内直交変換ブロックが縞模様となるからである。この場合はフィールド分離した方が相関が強くなり、フィールドモード直交変換を行うと係数Ａは小さくなる。従って、フィールドモードを選択することができる。
【００６０】
しかし、物体のエッジの種類、動き方、速度などによって、ブロック内に含まれる模様はいろいろなパターンがある。例えば図５（ｂ）で示すような縞模様となったとき、フレーム内直交変換を行っても係数Ａは大きくなりにくいので、誤ったモード判定をしてしまう可能性がある。しかし、実際は係数Ａの隣の係数が大きくなっている。よって、所定の係数として係数Ａに加えてその隣の係数も選択するようにすれば、より正確なモード判定を行うことができる。
【００６１】
図６はこのように選択した係数データを示す図であり、上述のように係数Ａとその隣の係数Ｂが、所定の係数データとして選択されていることを示す。また実際、図５（ｂ）のような模様は、細いエッジが速く動いた場合などによく発生する。このように、画像として発生しやすいパターンに基づいて、特定の複数の係数データを選べば、より正確なモード判定が可能となる。
【００６２】
次に、重み付け比較について更に説明する。上述では、重みが１、つまり重みの無い状態での比較を示したが、重み付け比較を行うことで、より符号化効率を向上させるモード判定を行うことができる。重み付けとは、上述した境界線の引き方を変えることであって、フレームとフィールドのモード判定エリアの範囲に偏りを持たせることと同じである。
【００６３】
図７は、ある境界線を設定してエリア分けした図である。この場合は、各代表値（ｘ、ｙ）が似たような値の場合、フレーム判定に入りやすい設定にしたものである。なお、重みの付け方、すなわち境界線の設定方法は任意に決めることができる。適当な関数で表しても良いし、テーブル値によって表しても良い。また、関数は線形であっても非線形であっても良い。
【００６４】
このようにモード判定エリアに重み付けを行って判定することは、例えば代表値ｘをその関数またはテーブルに代入し、値を変換したものと代表値ｙとを比較し、小さい方のモードを選ぶことと等しい。また、例えば代表値ｙをその関数またはテーブルに代入し、値を変換したものと代表値ｘとを比較し、小さい方のモードを選ぶことと等しい。また、例えば代表値ｘと代表値ｙとをそれぞれの関数またはテーブルに代入し、値を変換したものどうしを比較し、小さい方のモードを選ぶことと等しい。
【００６５】
物体の動き量が大きいときや物体が静止しているときには、各代表値の差が大きくなりやすく、上述のように各軸に偏って出現しやすいので、的確にモード判定されやすい。しかし、もともとの物体が複雑であったりノイズが重畳されているような場合は、大きく動いている場合を除いて、各代表値の差が小さくなるのでどちらのモードが選択されるか分からない。
【００６６】
この場合は、できるだけフレームに入るようにする方が良い。なぜなら、もともと動き量がそれほど大きくないので、フィールド別よりもフレーム化した方が相関が強い場合が多いからである。よって、代表値の差があまり大きくない場合にはフレーム内直交変換が選ばれるように重みを付けて比較することによって、さらに適切なモード判定を行うことができる。
【００６７】
以上の説明のように、本発明の実施の形態１では動きの指標となる垂直高域係数をあらかじめ求め、小さくなる方の直交変換モードを選択するようにしている。また、マクロブロック内の４個の直交変換ブロックのそれぞれの垂直高域係数を比較し、最大値を求め、そのモードの代表値とすることで、マクロブロック内の動きのある部分を適切に検出することができる。また、各モード毎に求められた代表値を比較することによってモード判定を行うので、ノイズの影響を受けにくい。画像に重畳されるホワイトノイズは各ノイズ粒子間に相関がないので、どちらのモードで直交変換を行っても特徴的な係数分布をしないと考えられる。そのため、どちらのモードでもノイズによる大きな垂直高域係数が出現しやすくなるが、もともとの画像の動きの特徴が隠されてしまうほど大きくはない。従って、各代表値を相対比較することによって、ノイズの悪影響によるモード判定ミスを低減することができる。さらに重み付け比較により動きのない部分でのノイズによる判定誤りを低減することができる。
【００６８】
よって、本発明の実施の形態１によれば、画像の動きによって符号化効率が悪化しないように適切な直交変換モード判定を行うことができる。さらに、画像の複雑度やノイズの重畳度による悪影響を受けにくい直交変換モード判定を行うことができる。
【００６９】
（実施の形態２）
以下、本発明の実施の形態２である適応直交変換モード判定方法を適用した画像符号化装置について動作説明する。本実施の形態における画像符号化装置は第１の実施の形態における画像符号化装置の構成図と同じである。但し、モード判定手段１７の動作が異なる。以下、モード判定手段の動作について説明する。なお、同一機能を有する他の構成要素の説明は省略する。
【００７０】
モード判定手段１７には、第１の代表値検出手段１５から出力されるフレームモード代表値ｘおよび第２の代表値検出手段１６から出力されるフィールドモード代表値ｙが入力されている。モード判定手段１７は、実施の形態１で説明したように各代表値（ｘ，ｙ）を重み付け比較して小さい方のモードを選択するものであるが、なおかつフレームモード代表値ｘとあらかじめ設定した閾値Ｔとを比較して、フレームモード代表値ｘが閾値Ｔより小さいときに、必ずフレームモードを選択するようにモード判定を行うものである。
【００７１】
図８に、このモード判定方法の概念図を示す。図８では、代表値（ｘ，ｙ）を判定するｘ−ｙ平面上において、重み付けを定義する所定の境界線によってフレーム判定エリアとフィールド判定エリアとが分かれているが、特にフレームモード代表値ｘがある閾値Ｔより小さいエリアは全てフレーム判定エリアとなるようにしている。
【００７２】
画像信号において、特に平坦な静止部分では、フレーム内直交変換およびフィールド内直交変換を行っても、どちらも高域成分に大きな値が発生しにくい。従って、値の小さな代表値同士を比較することになる。この場合の代表値の差は非常に小さく、類似した画像を含むブロック同士であってもモード判定結果が頻繁に変わってしまうことが起こり得る。
【００７３】
つまり、近接したブロックや近接フレームでの同じ位置のブロックでモード判定が切り替わると、圧縮歪みの出現形態が変わるので、歪みが目立つことになる。また、フレームモード代表値ｘが小さいときはほとんどが静止部分であるので、フレーム内直交変換を行う方が、画素間の相関が強くなり符号化効率が上がる。
【００７４】
従って上述のように、閾値を設けた判定を行うことによって、フレームモード代表値ｘが小さいときは必ずフレームモードが判定されるようになり、符号化効率が向上する。
【００７５】
以上のように、本発明の実施の形態２によれば、画像の静止部分での誤検出を低減し、より正確な直交変換モード判定を行うことが出来る。
【００７６】
（実施の形態３）
以下、本発明の実施の形態３である適応直交変換モード判定方法を適用した画像符号化装置について動作説明する。本実施の形態における画像符号化装置は第１の実施の形態における画像符号化装置の構成図と同じである。但し、第１の直交変換器１３と第２の直交変換器１４の動作内容が異なる。以下、各直交変換器１３，１４の動作について説明する。なお、同一機能を有する他の構成要素の説明は省略する。
【００７７】
第１の直交変換器１３および第２の直交変換器１４は、実際に符号化に用いる第３の直交変換器１９と同じ直交変換を行うものであるから、第３の直交変換器１９と同一な構成であっても良い。しかし、第１の直交変換器１３および第２の直交変換器１４によって算出された係数データのうち、所定の係数データ以外はモード判定に使用されない。
【００７８】
従って、実施の形態１で説明したように、所定の係数データとして１個または数個だけを選んで判定を行うときは、それらの係数だけを算出する構成とする。これによって、第１の直交変換器１３および第２の直交変換器１４は、第３の直交変換器１９に比べて格段に簡単な構成となる。そして、第１の代表値検出手段１５と第２の代表値検出手段１６は、生成される係数データのみを用いてそれぞれ最大値検出し、代表値を決定する。
【００７９】
さらに、第１の直交変換器１３および第２の直交変換器１４は、実際に直交変換されたときどちらのモードを行った方が有利であるかをあらかじめ計算し、分析するためのものである。そして、第１の直交変換器１３および第２の直交変換器１４の演算結果はモード判定のためだけに使用され、実際に符号化されるデータとはならない。
【００８０】
従って、第１の直交変換器１３および第２の直交変換器１４は、実際に符号化に用いる第３の直交変換器１９における直交変換演算式よりも簡略化された直交変換式を用いて演算するように構成する。
【００８１】
一般に符号化に用いる直交変換は非常に高精度な演算が要求される。これは、演算誤差によって歪みが生まれることを防止するためである。このような直交変換器を回路で実現すると非常に大規模になってしまう。高精度な直交変換式は、ブロック化された画素データに対してコサイン関数で表された所定の変換係数を乗じて、加算するという行列演算である。そして、この変換係数は小数で表されるものである。通常は演算精度を確保するため大規模な乗算器を用いた演算を行うことになる。
【００８２】
しかし、第１の直交変換器１３および第２の直交変換器１４で用いる簡略化された直交変換式では、各変換係数を特殊な小数に近似して用いる。この特殊な小数とは、１／（２ⁿ）で表される小数およびそれらの加算の組み合わせで算出可能な小数である。この近似変換係数を適用すると、上述した乗算がビットシフトおよび加算で実現できるため、大規模な乗算器が不要となる。また、ソフトウェアで行う場合には、演算量が減る。なお、用いる近似変換係数の精度は任意に決めることができる。
【００８３】
このように簡略化された直交変換式を用いた場合、演算後の係数データは一定の誤差を含むが、先にも述べたように画像の特徴に応じて代表値（ｘ，ｙ）の分布に偏りがあるため、高精度な演算式を用いた場合とほぼ同等なモード判定を行うことができる。
【００８４】
以上のように、本発明の実施の形態３によれば、モード判定に用いる直交変換器の構成を簡略化することができるので、非常に小規模な回路構成で、実施の形態１または２で説明した正確な適応直交変換モード判定を行うことができる。
【００８５】
なお、上記の全ての実施形態の説明において、マクロブロック単位でモード判定を行う場合を示したが、各直交変換ブロック毎にモード判定を行う場合も同様である。例えば、８×８画素のフレーム化された直交変換ブロックに対して、フレーム内直交変換を行うときはそのままの構造で直交変換を行い、フィールド内直交変換を行うときはフィールド分離された２組の８×４画素ブロック毎に直交変換を行うような場合がある。このときも、フレームモード代表値ｘは１つのブロックから得られる所定の係数データのうちの最大値とすれば良く、フィールドモード代表値ｙは２つのブロックから得られる所定の係数データのうちの最大値とすればよい。
【００８６】
なお、上記の全ての実施形態の説明では、特定の垂直高域係数を用いたが、これのみに限定されるものではなく、どの係数を用いるか、および何個用いるか等は任意に設定可能である。
【００８７】
なお、上記の全ての実施形態の説明では、入力端子１０に画像信号が入力される場合を説明したが、たとえば動き補償予測を併用した場合には、差分信号が入力される場合もある。この場合も、差分信号のマクロブロックに対して本発明を適用することが可能である。
【００８８】
なお、上記の全ての実施形態の説明では、本発明を適用した画像符号化装置の構成の一例を示したが、同一の作用を有する構成であれば、他の構成であっても構わない。また、ハードウェアに関して説明したが、ソフトウェアに本発明を適用することも可能である。
【００８９】
なお、本実施の形態では、適応直交変換モード判定方法を適用した画像符号化装置について説明したが、これに限らず、インターレース画像から静止画を作成する際にも本発明の動き検出装置や動き検出方法を用いることが出来る。
【００９０】
すなわち、ビデオカメラで撮影された画像信号は通常インターレース信号であるが、ビデオカメラで撮影されたインターレース信号から静止画を作成する機能を有するビデオカメラがある。このようなビデオカメラにおいて、静止画を作成する際、上記各実施の形態で説明したように、各ブロック単位について動きの有無を検出する。そして、動いていると判定されたブロックについては、疑似フレーム処理を行って静止画を作成し、動いていないと判定されたブロックについてはそのまま静止画を作成する。ここで、疑似フレーム処理とは、インターレース画像で、フィールド画像を補間することによってフレーム画像を生成する処理を意味する。
【００９１】
インターレース画像の動きのある部分をそのままフレームとして静止画を作成した場合、静止画がぶれたものになってしまう。そこで、動きのあると判定された部分に疑似フレーム処理を施すことにより、静止画のぶれを押さえることが出来るようになる。
【００９２】
このように、本発明の動き検出装置は、画像符号化装置のみならず、ビデオカメラなどの、インターレース画像から静止画を作成する装置にも適用することが出来る。
【００９３】
なお、本発明の動き検出装置や動き検出方法は、上述したように画像符号化装置やインターレース画像から静止画を作成する装置のみならず、量子化する際やその他の用途で、インターレース画像の動きの有無に応じて何らかの設定を変える必要のあるものにも適用することが出来る。
【００９４】
なお、本発明に係るプログラムは、上述した本発明の動き検出装置の全部又は一部の手段（又は、装置、素子、回路、部等）の機能をコンピュータにより実行させるためのプログラムであって、コンピュータと協働して動作するプログラムである。
【００９５】
本発明に係るプログラムは、上述した本発明の適応直交変換モード判定装置の全部又は一部の手段（又は、装置、素子、回路、部等）の機能をコンピュータにより実行させるためのプログラムであって、コンピュータと協働して動作するプログラムである。
【００９６】
本発明に係る媒体は、上述した本発明の動き検出装置の全部又は一部の手段の全部又は一部の機能をコンピュータにより実行させるためのプログラムを担持した媒体であり、コンピュータにより読み取り可能且つ、読み取られた前記プログラムが前記コンピュータと協動して前記機能を実行する媒体である。
【００９７】
本発明に係る媒体は、上述した本発明の適応直交変換モード判定装置の全部又は一部の手段の全部又は一部の機能をコンピュータにより実行させるためのプログラムを担持した媒体であり、コンピュータにより読み取り可能且つ、読み取られた前記プログラムが前記コンピュータと協動して前記機能を実行する媒体である。
【００９８】
尚、本発明の上記「一部の手段（又は、装置、素子、回路、部等）」、本発明の上記「一部のステップ（又は、工程、動作、作用等）」とは、それらの複数の手段又はステップの内の、幾つかの手段又はステップを意味し、あるいは、一つの手段又はステップの内の、一部の機能又は一部の動作を意味するものである。
【００９９】
又、本発明のプログラムの一利用形態は、コンピュータにより読み取り可能な記録媒体に記録され、コンピュータと協働して動作する態様であっても良い。
【０１００】
又、本発明のプログラムの一利用形態は、伝送媒体中を伝送し、コンピュータにより読みとられ、コンピュータと協働して動作する態様であっても良い。
【０１０１】
又、本発明のデータ構造としては、データベース、データフォーマット、データテーブル、データリスト、データの種類などを含む。
【０１０２】
又、記録媒体としては、ＲＯＭ等が含まれ、伝送媒体としては、インターネット等の伝送媒体、光・電波・音波等が含まれる。
【０１０３】
又、上述した本発明のコンピュータは、ＣＰＵ等の純然たるハードウェアに限らず、ファームウェアや、ＯＳ、更に周辺機器を含むものであっても良い。
【０１０４】
尚、以上説明した様に、本発明の構成は、ソフトウェア的に実現しても良いし、ハードウェア的に実現しても良い。
【０１０５】
【発明の効果】
以上説明したところから明らかなように、本発明は、画像信号の動きの有無を適切に検出することが出来る動き検出装置、動き検出方法、適応直交変換モード判定装置、適応直交変換モード判定方法、媒体、及びプログラムを提供することが出来る。
【０１０６】
また、本発明は、画像信号の動きの有無を判定する際、画像信号の複雑度やノイズの重畳度による判定ミスが生じにくい動き検出装置、動き検出方法、適応直交変換モード判定装置、適応直交変換モード判定方法、媒体、及びプログラムを提供することが出来る。
【０１０７】
また、本発明は、画像符号化に用いる直交変換器よりも非常に簡単な構成で実現可能であるため、小規模な回路構成または演算量で画像信号の動きの有無を判定する動き検出装置、動き検出方法、適応直交変換モード判定装置、適応直交変換モード判定方法、媒体、及びプログラムを提供することが出来る。
【図面の簡単な説明】
【図１】本発明の実施の形態１における適応直交変換モード判定方法を適用した画像符号化装置の構成図である。
【図２】本発明の実施の形態１における適応直交変換モード判定方法を適用した画像符号化装置の各モードの直交変換ブロックの生成方法を示す概念図である。
（ａ）マクロブロックの構成を示す図である。
（ｂ）フレーム内直交変換ブロックの構成を示す図である。
（ｃ）フィールド内直交変換ブロックの構成を示す図である。
【図３】本発明の実施の形態１における適応直交変換モード判定方法を適用した画像符号化装置の所定の係数データの一例を示す概念図である。
【図４】本発明の実施の形態１における適応直交変換モード判定方法を適用した画像符号化装置の重み１の場合のモード判定を説明する概念図である。
【図５】本発明の実施の形態１における適応直交変換モード判定方法を適用した画像符号化装置の特定画像パターンを示す概念図である。
（ａ）第１の特定画像パターンを示す図である。
（ｂ）第２の特定画像パターンを示す図である。
【図６】本発明の実施の形態１における適応直交変換モード判定方法を適用した画像符号化装置の所定の係数データの第２の例を示す概念図である。
【図７】本発明の実施の形態１における適応直交変換モード判定方法を適用した画像符号化装置の重み付け比較によるモード判定を説明する概念図である。
【図８】本発明の実施の形態２における適応直交変換モード判定方法を適用した画像符号化装置の境界線の一例を説明する概念図である。
【符号の説明】
１０入力端子
１１フレーム内直交変換ブロック生成手段
１２フィールド内直交変換ブロック生成手段
１３第１の直交変換器
１４第２の直交変換器
１５第１の代表値検出手段
１６第２の代表値検出手段
１７モード判定手段
１８選択器
１９第３の直交変換器
２０出力端子

[0001]
[Technical Field of the Invention]
The present invention relates to a motion detection device, a motion detection method, an adaptive orthogonal transform mode determining device, an adaptive orthogonal transform mode determining method, a medium, and a program that determine the motion of an interlaced image signal.
[0002]
[Prior art]
In recent years, high-efficiency coding techniques have been actively developed for the digital signal processing of image signals in digital video equipment and the like, in order to record and reproduce at a limited transmission rate. High-efficiency coding is a technique that compresses the amount of data by exploiting the redundancy of an image signal. A commonly used method divides an image frame into blocks of, for example, 8 × 8 pixels (hereinafter referred to as orthogonal transform blocks), applies an orthogonal transform to each orthogonal transform block, and compresses the result. The coefficient data produced by the orthogonal transform are quantized with a predetermined quantization width, subjected to statistically determined variable-length coding, and recorded with improved compression efficiency.
[0003]
Normally, because an image signal has redundancy, the coefficient data after the orthogonal transform tend to have large values in the low-frequency components and smaller values toward the high-frequency components. In addition, because compression distortion in high-frequency components is hard to notice given human visual characteristics, higher-frequency components are quantized with a larger quantization width. Consequently, the quantized coefficient data are more likely to be zero at higher frequencies. Exploiting this property, variable-length coding rearranges the quantized coefficient data in a predetermined scan order from low to high frequency and, for the resulting data sequence, encodes each combination of the number of consecutive zeros and the value of the non-zero coefficient that immediately follows into a predetermined, statistically determined variable-length code. As described above, runs of zeros tend to be longer at higher frequencies, so fewer codes need to be assigned and efficient encoding is possible.
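The scan-and-pair step described above can be sketched as follows. This is only an illustration under stated assumptions: the text says "a predetermined scan order" without naming it, so the zigzag order used here is an assumption, as are the function names `zigzag_order` and `run_level_pairs` and the `"EOB"` end marker. The final mapping of each pair to a variable-length code is omitted because it depends on a code table that the text does not give.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zigzag order.

    Assumption: zigzag is used as the low-to-high frequency scan order.
    """
    def key(pos):
        r, c = pos
        d = r + c  # index of the anti-diagonal
        # Odd diagonals are walked with the row increasing,
        # even diagonals with the column increasing.
        return (d, r if d % 2 else c)
    return sorted(((r, c) for r in range(n) for c in range(n)), key=key)


def run_level_pairs(block):
    """Turn a quantized block into (zero_run, level) pairs plus an end marker."""
    pairs, run = [], 0
    for r, c in zigzag_order(len(block)):
        value = block[r][c]
        if value == 0:
            run += 1              # extend the current run of zeros
        else:
            pairs.append((run, value))
            run = 0
    pairs.append("EOB")           # hypothetical marker standing for the trailing zeros
    return pairs


# Toy quantized block: only three low-frequency coefficients are non-zero.
quantized = [[0] * 8 for _ in range(8)]
quantized[0][0], quantized[0][1], quantized[2][0] = 12, 3, -1
pairs = run_level_pairs(quantized)
```

Because the 61 remaining zeros collapse into the single end marker, the block is described by four symbols, which is the saving the paragraph refers to.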
[0004]
However, in an interlaced image signal, when the imaged object is moving, or when the camera is moved while capturing the object, as in panning, there is a time difference between the two fields that make up an image frame, so the correlation between adjacent pixels belonging to different fields becomes weaker. In a block containing such motion, performing the orthogonal transform with the two fields combined into a frame (hereinafter referred to as intra-frame orthogonal transform) produces large high-frequency components and degrades coding efficiency. This is because large high-frequency coefficients do not become zero even after quantization and must be coded, which increases the amount of generated code.
[0005]
In such a case, coding efficiency is better if the blocks are formed separately for each field before the orthogonal transform is performed (hereinafter referred to as intra-field orthogonal transform), because large high-frequency components do not appear. This is because the correlation within a field is stronger than the correlation between fields. In addition, if a moving portion is coded using the intra-frame orthogonal transform, quantization distortion may prevent the motion from being reproduced faithfully, so the intra-field orthogonal transform is preferable.
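A small numerical sketch can make this effect concrete. It assumes that the orthogonal transform is a DCT-II and looks at a single pixel column only. The pixel values (0 and 100) and the helper name `dct_coeff` are illustrative and do not come from the source.

```python
import math


def dct_coeff(samples, k):
    """k-th coefficient of the orthonormal DCT-II of a 1-D sequence."""
    n = len(samples)
    scale = math.sqrt((1 if k == 0 else 2) / n)
    return scale * sum(s * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                       for i, s in enumerate(samples))


# One pixel column of a 16-line frame. The even field saw a dark object (0);
# by the time the odd field was captured the object had moved away (100).
column = [0 if line % 2 == 0 else 100 for line in range(16)]

frame_block = column[:8]    # lines 0..7, both fields interleaved
even_field = column[0::2]   # the 8 lines of the even field
odd_field = column[1::2]    # the 8 lines of the odd field

# Magnitude of the highest vertical frequency in each arrangement.
frame_hf = abs(dct_coeff(frame_block, 7))
field_hf = max(abs(dct_coeff(even_field, 7)), abs(dct_coeff(odd_field, 7)))
```

In this toy case the interleaved column alternates between the two values, so `frame_hf` is large, whereas each field on its own is constant and `field_hf` is zero up to rounding error.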
[0006]
Therefore, coding is performed while switching, block by block, between a mode in which the orthogonal transform is applied to framed blocks (intra-frame orthogonal transform) and a mode in which it is applied to blocks separated by field (intra-field orthogonal transform). That is, it is desirable to perform the intra-frame orthogonal transform in still portions and the intra-field orthogonal transform in moving portions. The orthogonal transform mode may be switched per orthogonal transform block, or per macroblock made up of a plurality of orthogonal transform blocks.
[0007]
When the orthogonal transform mode is switched as described above, it is necessary to use a mode determination method that accurately detects the still and moving portions of an image frame and selects an appropriate orthogonal transform mode (see, for example, Patent Documents 1 and 2).
[0008]
One conventional mode determination method calculates, for a framed orthogonal transform block or macroblock, the sum of absolute differences between vertically adjacent pixels (between lines) and compares it with a predetermined threshold. If the sum of absolute differences is smaller than the threshold, the intra-frame orthogonal transform is performed; if it is larger, the intra-field orthogonal transform is performed. In other words, in a portion with motion the differences between vertically adjacent pixels tend to be large and exceed the set threshold, so the motion can be detected.
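This conventional rule can be sketched in a few lines. The function names, the 4 × 4 toy blocks, and the threshold of 500 are illustrative assumptions. The text does not say what happens when the sum equals the threshold, so the sketch arbitrarily assigns that case to the frame mode.

```python
def vertical_sad(block):
    """Sum of absolute differences between vertically adjacent lines."""
    return sum(abs(a - b)
               for upper, lower in zip(block, block[1:])
               for a, b in zip(upper, lower))


def conventional_mode(block, threshold):
    """Return 'field' when the line-to-line differences exceed the threshold."""
    # Assumption: equality with the threshold falls on the frame side.
    return "field" if vertical_sad(block) > threshold else "frame"


flat = [[10] * 4 for _ in range(4)]                          # flat still area
striped = [[0] * 4 if r % 2 == 0 else [100] * 4 for r in range(4)]  # lines alternate

mode_flat = conventional_mode(flat, threshold=500)
mode_striped = conventional_mode(striped, threshold=500)
```

The striped block is classified as moving whether the stripes come from inter-field motion or from a still picture that simply contains fine horizontal detail, since the rule sees only the line-to-line differences.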
[0009]
Another conventional mode determination method performs the orthogonal transform, quantization, and variable-length coding in each mode, adopts the mode that generates the smaller amount of code, and uses the code generated in that mode.
[0010]
As described above, when an interlaced image is coded while switching for each block between a mode in which the orthogonal transform is applied to framed blocks (intra-frame orthogonal transform) and a mode in which it is applied to blocks separated by field (intra-field orthogonal transform), it is necessary to detect the motion of each block of the interlaced image.
[0011]
[Patent Document 1]
JP-A-7-87448
[Patent Document 2]
JP-A-10-136367
[0012]
[Problems to be solved by the invention]
However, when detecting the motion of each block of the interlaced image as described above, there are the following problems.
[0013]
In the first conventional method, for example, when a picture is complicated or contains a large amount of noise, the differences between lines become large even in a still portion, and the sum of absolute differences tends to be large. It therefore often exceeds the predetermined threshold, and a still portion is judged as if it contained motion.
[0014]
However, in a still portion, coding efficiency is better with the intra-frame orthogonal transform, because the correlation between pixels is stronger within the frame than after field separation. Such erroneous detection occurs because comparing the sum of absolute differences with a threshold does not analyze whether the orthogonal transform is more efficient within a frame or within a field; it simply always selects the field mode once the correlation falls below a certain level.
[0015]
Also, when the orthogonal transform mode determination is applied to a macroblock, it is difficult to detect partial motion within the macroblock. That is, when only one orthogonal transform block in a macroblock contains motion, the absolute differences of that block are relatively large, but those of the other orthogonal transform blocks are very small, so the total sum of absolute differences is averaged out and the macroblock is judged as if it contained no motion.
[0016]
Therefore, when detecting the motion of the block of the interlaced image by the first conventional method, there is a problem that appropriate detection cannot be performed depending on the picture.
[0017]
In the second conventional method, for example, when realized by hardware, it is necessary to use two sets of circuits for orthogonal transform, quantization, and variable-length coding for each mode, thereby increasing the circuit scale.
[0018]
If this is done without increasing the circuit scale, the operating frequency of the circuit must be at least doubled, and power consumption increases. Even when realized in software, the amount of computation becomes very large. In addition, because the mode determination is based only on the amount of generated code, it takes no account of motion. There is therefore a possibility that the intra-frame orthogonal transform is performed without considering partial motion within the macroblock.
[0019]
That is, when the motion of a block of an interlaced image is detected by the second conventional method, there is a problem that the circuit scale increases, or, if the circuit scale is not increased, that the power consumption increases.
[0020]
In view of the above problems, an object of the present invention is to provide a motion detection device, a motion detection method, an adaptive orthogonal transform mode determination device, an adaptive orthogonal transform mode determination method, a medium, and a program that are not easily affected by the complexity of an image or the amount of noise and that can detect motion appropriately according to the motion of an interlaced image.
[0021]
Another object of the present invention is to provide a motion detection device, a motion detection method, an adaptive orthogonal transform mode determination device, an adaptive orthogonal transform mode determination method, a medium, and a program that can be realized with a simple circuit configuration.
[0022]
[Means for Solving the Problems]
In order to solve the above-described problems, a first aspect of the present invention is a motion detection device that detects the motion of each of a plurality of pixel blocks obtained by dividing an interlaced image signal, the device comprising: intra-frame orthogonal transform block generating means (11 in FIG. 1) for dividing the pixel block into n (n ≧ 1) intra-frame orthogonal transform blocks;
first orthogonal transform means (13 in FIG. 1) for performing an orthogonal transform for motion detection on each of the divided intra-frame orthogonal transform blocks;
first representative value detecting means (15 in FIG. 1) for extracting m (m ≧ 1) predetermined coefficient data from each of the orthogonal transform blocks and selecting one of the extracted coefficient data as a frame representative value x;
intra-field orthogonal transform block generating means (12 in FIG. 1) for dividing the pixel block into a predetermined number k (k ≧ 1) of intra-field orthogonal transform blocks;
second orthogonal transform means (14 in FIG. 1) for performing an orthogonal transform for motion detection on each of the divided intra-field orthogonal transform blocks;
second representative value detecting means (16 in FIG. 1) for extracting s (s ≧ 1) predetermined coefficient data from each of the orthogonal transform blocks and selecting one of the extracted coefficient data as a field representative value y; and
mode determination means (17 in FIG. 1) for determining whether or not the pixel block contains motion by comparing the frame representative value x with the field representative value y.
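The chain of means listed above can be sketched in software as follows. This is a minimal illustration, not the patented implementation, and it rests on several assumptions: the pixel block is a 16 × 16 macroblock split into four 8 × 8 blocks per mode, the orthogonal transform is a DCT-II, the single predetermined coefficient sits at vertical frequency 7 and horizontal frequency 0, the representative value is the maximum absolute value, and a tie between x and y counts as no motion. All function names are hypothetical.

```python
import math

N = 8  # side length of one orthogonal transform block


def dct2_coeff(block, v, u):
    """One coefficient (vertical freq v, horizontal freq u) of an 8x8 DCT-II."""
    cv = math.sqrt((1 if v == 0 else 2) / N)
    cu = math.sqrt((1 if u == 0 else 2) / N)
    return cv * cu * sum(
        block[r][c]
        * math.cos(math.pi * (2 * r + 1) * v / (2 * N))
        * math.cos(math.pi * (2 * c + 1) * u / (2 * N))
        for r in range(N) for c in range(N))


def split_columns(rows):
    """Cut 8 lines of 16 pixels into a left and a right 8x8 block."""
    return [[row[:N] for row in rows], [row[N:] for row in rows]]


def frame_blocks(mb):
    """Four blocks taken with both fields interleaved (upper and lower halves)."""
    return split_columns(mb[:N]) + split_columns(mb[N:])


def field_blocks(mb):
    """Four blocks taken after separating even lines from odd lines."""
    return split_columns(mb[0::2]) + split_columns(mb[1::2])


def representative(blocks, positions=((7, 0),)):
    """Largest absolute value among the chosen coefficients of all blocks."""
    return max(abs(dct2_coeff(b, v, u)) for b in blocks for v, u in positions)


def detect_motion(mb):
    """Return (has_motion, x, y) for one 16x16 macroblock."""
    x = representative(frame_blocks(mb))   # frame representative value
    y = representative(field_blocks(mb))   # field representative value
    return y < x, x, y


# Toy inputs: a still vertical ramp, and a block whose two fields differ.
still = [[10 * r for _ in range(16)] for r in range(16)]
moving = [[0] * 16 if r % 2 == 0 else [100] * 16 for r in range(16)]

still_result = detect_motion(still)
moving_result = detect_motion(moving)
```

For the ramp, separating the fields doubles the step between successive lines, so y exceeds x and the block is judged still. For the second block, each field is flat, so y falls to about zero while x is large, and the block is judged to contain motion.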
[0023]
A second aspect of the present invention is the motion detection device according to the first aspect, wherein the predetermined coefficient data is a coefficient indicating a specific vertical high-frequency component.
[0024]
A third aspect of the present invention is the motion detection device according to the first aspect, wherein the frame representative value x and the field representative value y are each the maximum of the absolute values of the extracted predetermined coefficient data in the respective orthogonal transform.
[0025]
A fourth aspect of the present invention is the motion detection device according to the third aspect, wherein the presence or absence of motion is determined by (1) directly comparing the frame representative value x with the field representative value y, or (2) comparing values obtained by converting one or both of the representative values with a predetermined function or table, and wherein it is determined that there is no motion when the frame representative value x is smaller and that there is motion when the field representative value y is smaller.
[0026]
A fifth aspect of the present invention is the motion detection device according to the fourth aspect, wherein it is determined that there is no motion when the frame representative value x is equal to or less than a predetermined threshold, even if the frame representative value x is larger than the field representative value y.
[0027]
A sixth aspect of the present invention is the motion detection device according to the first aspect, wherein each of the orthogonal transforms calculates only the predetermined coefficient data to be extracted.
[0028]
A seventh aspect of the present invention is an adaptive orthogonal transform mode determination device that, when an interlaced image signal is divided into a plurality of pixel blocks and each pixel block is encoded, performs control for switching between a frame orthogonal transform for image encoding and a field orthogonal transform for image encoding for each pixel block, the device comprising:
Motion determining means for determining the presence or absence of motion in the pixel block; and
Control means (18 in FIG. 1) for performing control so that (1) the field orthogonal transform for image encoding is applied to the pixel block when it is determined that the pixel block has motion, and (2) the frame orthogonal transform for image encoding is applied to the pixel block when it is determined that the pixel block has no motion,
Wherein the motion determining means uses the motion detection device according to any one of the first to sixth aspects of the present invention.
[0029]
An eighth aspect of the present invention is a motion detection method for detecting the motion of each of a plurality of pixel blocks obtained by dividing an interlaced image signal, the method comprising:
An intra-frame orthogonal transformation block generating step of dividing the pixel block into n (n ≧ 1) predetermined intra-frame orthogonal transformation blocks;
A first orthogonal transformation step of performing an orthogonal transformation for motion detection on each of the divided intra-frame orthogonal transformation blocks;
A first representative value detecting step of extracting m (m ≧ 1) predetermined coefficient data from each of the orthogonally transformed blocks, selecting one of the extracted coefficient data, and setting it as a frame representative value x;
An intra-field orthogonal transformation block generating step of dividing the pixel block into k (k ≧ 1) predetermined intra-field orthogonal transformation blocks;
A second orthogonal transformation step of performing an orthogonal transformation for motion detection on each of the divided intra-field orthogonal transformation blocks;
A second representative value detecting step of extracting s (s ≧ 1) predetermined coefficient data from each of the orthogonally transformed blocks, selecting one of the extracted coefficient data, and setting it as a field representative value y; and
A mode determining step of determining the presence or absence of motion in the pixel block by comparing the frame representative value x with the field representative value y.
[0030]
A ninth aspect of the present invention is an adaptive orthogonal transform mode determination method that, when an interlaced image signal is divided into a plurality of pixel blocks and each pixel block is encoded, performs control for switching between a frame orthogonal transform for image encoding and a field orthogonal transform for image encoding for each pixel block, the method comprising:
A motion determining step of determining the presence or absence of motion in the pixel block; and
A control step of performing control so that (1) the field orthogonal transform for image encoding is applied to the pixel block when it is determined that the pixel block has motion, and (2) the frame orthogonal transform for image encoding is applied to the pixel block when it is determined that the pixel block has no motion,
Wherein the motion determining step uses the motion detection method according to the eighth aspect of the present invention.
[0031]
A tenth aspect of the present invention is a program for causing a computer to function as the following means of the motion detection device according to the first aspect of the present invention:
Intra-frame orthogonal transformation block generating means (11 in FIG. 1) for dividing the pixel block into n (n ≧ 1) predetermined intra-frame orthogonal transformation blocks;
First orthogonal transformation means (13 in FIG. 1) for performing an orthogonal transformation for motion detection on each of the divided intra-frame orthogonal transformation blocks;
First representative value detecting means (15 in FIG. 1) for extracting m (m ≧ 1) predetermined coefficient data from each of the orthogonally transformed blocks, selecting one of the extracted coefficient data, and setting it as a frame representative value x;
Intra-field orthogonal transformation block generating means (12 in FIG. 1) for dividing the pixel block into k (k ≧ 1) predetermined intra-field orthogonal transformation blocks;
Second orthogonal transformation means (14 in FIG. 1) for performing an orthogonal transformation for motion detection on each of the divided intra-field orthogonal transformation blocks;
Second representative value detecting means (16 in FIG. 1) for extracting s (s ≧ 1) predetermined coefficient data from each of the orthogonally transformed blocks, selecting one of the extracted coefficient data, and setting it as a field representative value y; and
Mode determination means (17 in FIG. 1) for determining the presence or absence of motion in the pixel block by comparing the frame representative value x with the field representative value y.
[0032]
An eleventh aspect of the present invention is a medium that carries the program of the tenth aspect of the present invention, and is a medium that can be processed by a computer.
[0033]
According to the present invention, motion can be detected appropriately in accordance with the motion of the image, with little adverse effect from, for example, the complexity of the image or the degree of superimposed noise. In addition, the detection can be realized with a very simple configuration, which suppresses increases in circuit scale and power consumption.
[0034]
BEST MODE FOR CARRYING OUT THE INVENTION
Hereinafter, embodiments of the present invention will be described with reference to the drawings.
[0035]
(Embodiment 1)
Hereinafter, an image coding apparatus to which the adaptive orthogonal transform mode determination method according to the first embodiment of the present invention is applied will be described. FIG. 1 is a configuration diagram of an image encoding device according to the present embodiment.
[0036]
In FIG. 1, reference numeral 10 denotes an input terminal, 11 denotes intra-frame orthogonal transform block generating means, 12 denotes intra-field orthogonal transform block generating means, 13 denotes a first orthogonal transformer, 14 denotes a second orthogonal transformer, 15 denotes first representative value detecting means, 16 denotes second representative value detecting means, 17 denotes mode determining means, 18 denotes a selector, 19 denotes a third orthogonal transformer, and 20 denotes an output terminal.
[0037]
Note that the first orthogonal transformer 13 of the present embodiment is an example of the first orthogonal transformation means of the present invention, the second orthogonal transformer 14 of the present embodiment is an example of the second orthogonal transformation means of the present invention, and the selector 18 of the present embodiment is an example of the control means of the present invention.
[0038]
The operation of the image coding apparatus thus configured will be described below.
[0039]
First, an interlaced image signal is divided into predetermined pixel blocks and input from an input terminal 10.
[0040]
Here, the pixel block is the unit for which it is determined whether the orthogonal transform is performed in the intra-frame orthogonal transform mode (the orthogonal transform is applied to a block in its framed state) or in the intra-field orthogonal transform mode (the orthogonal transform is applied to blocks separated by field). It may be a single orthogonal transform block or a macroblock made up of a plurality of orthogonal transform blocks.
[0041]
This pixel block is input to the intra-frame orthogonal transformation block generation unit 11 and the intra-field orthogonal transformation block generation unit 12.
[0042]
Next, the intra-frame orthogonal transformation block generation unit 11 divides the input pixel block in a framed state, and generates an intra-frame orthogonal transformation block. Similarly, the intra-field orthogonal transform block generation unit 12 divides the input pixel block by field separation and generates an intra-field orthogonal transform block.
[0043]
FIG. 2 shows how the orthogonal transform blocks are formed in each mode. First, FIG. 2A shows a macroblock, which is the pixel block to which the orthogonal transform mode determination is applied. In this case, the macroblock is composed of 16 × 16 pixels and is divided into four orthogonal transform blocks. Since the interlaced image signal is divided into macroblocks and input from the input terminal 10, each macroblock is composed of two fields. In FIGS. 2A, 2B, and 2C, white pixels belong to the even field and shaded pixels belong to the odd field.
[0044]
When the intra-frame orthogonal transformation is performed on this macroblock, intra-frame orthogonal transformation blocks are formed as shown in FIG. 2B and then orthogonally transformed. That is, the macroblock having the above-described frame configuration is divided, with its two fields left combined, into orthogonal transform blocks of 8 × 8 pixels, which are the units of orthogonal transformation.
[0045]
On the other hand, when the intra-field orthogonal transform is performed, intra-field orthogonal transform blocks are formed as shown in FIG. 2C. That is, the above-described macroblock is field-separated and divided, for each field, into orthogonal transform blocks of 8 × 8 pixels, which are the units of orthogonal transformation.
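As a rough illustration of the two block formations of FIG. 2, the following Python sketch splits a 16 × 16 macroblock into four 8 × 8 blocks in each mode. The function names, and the assumption that lines 0, 2, 4, ... belong to one field and lines 1, 3, 5, ... to the other, are introduced here for illustration only and are not taken from the specification.

```python
def split_frame_blocks(mb):
    """Cut a 16x16 macroblock (a list of 16 rows) into four 8x8 blocks,
    leaving the lines of both fields interleaved (frame structure)."""
    return [[row[c:c + 8] for row in mb[r:r + 8]]
            for r in (0, 8) for c in (0, 8)]


def split_field_blocks(mb):
    """Separate the two fields first, then cut each 8-line field
    into two 8x8 blocks (field structure)."""
    first_field = mb[0::2]    # lines 0, 2, ..., 14 (assumed to form one field)
    second_field = mb[1::2]   # lines 1, 3, ..., 15 (assumed to form the other)
    return [[row[c:c + 8] for row in field]
            for field in (first_field, second_field) for c in (0, 8)]


# Each pixel stores its own line number, so the line order stays visible.
macroblock = [[line] * 16 for line in range(16)]
frame_blocks = split_frame_blocks(macroblock)
field_blocks = split_field_blocks(macroblock)

print([row[0] for row in frame_blocks[0]])  # [0, 1, 2, 3, 4, 5, 6, 7]
print([row[0] for row in field_blocks[0]])  # [0, 2, 4, 6, 8, 10, 12, 14]
```

Both functions return four blocks, so the later stages can treat the two modes symmetrically.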
[0046]
Next, the first orthogonal transformer 13 orthogonally transforms the intra-frame orthogonal transformation block output from the intra-frame orthogonal transformation block generation means 11 and outputs coefficient data. Then, the first representative value detecting means 15 extracts only predetermined coefficient data from the coefficient data and converts it into an absolute value. The predetermined coefficient data is coefficient data indicating a predetermined specific frequency component.
[0047]
FIG. 3 shows an example of the predetermined coefficient data. FIG. 3 is a schematic diagram of the coefficient data generated by the orthogonal transformation, and shows the 64 coefficient data generated when an orthogonal transformation block of 8 × 8 pixels is transformed. In FIG. 3, coefficient data toward the upper left represent lower frequencies, and coefficient data toward the lower right represent higher frequencies. The coefficient data at the upper-left corner is the direct current (DC) component. The coefficient data labeled A in FIG. 3 represents a vertical high-frequency component and is selected as the predetermined coefficient data.
[0048]
Therefore, the first representative value detecting means 15 extracts the coefficient data at the position A from each of the four intra-frame orthogonal transformation blocks in the macroblock and converts them into absolute values. The maximum of these absolute values is then detected and output as the frame mode representative value x.
[0049]
Similarly, the second orthogonal transformer 14 orthogonally transforms the intra-field orthogonal transformation blocks output from the intra-field orthogonal transformation block generation means 12 and outputs coefficient data. Then, the second representative value detecting means 16 extracts only the predetermined coefficient data from the coefficient data and converts it into an absolute value. The predetermined coefficient data here is also the coefficient data at the position A shown in FIG. 3.
[0050]
Accordingly, the second representative value detecting means 16 extracts the coefficient data at the position A for each of the four intra-field orthogonal transformation blocks existing in the macroblock, and converts the coefficient data into absolute values. Then, the maximum value among the absolute values is detected and output as the field mode representative value y.
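The representative value detection described above can be sketched in Python as follows. This sketch assumes that the orthogonal transform is a two-dimensional DCT-II and that the position A corresponds to vertical frequency 7 and horizontal frequency 0; neither detail can be confirmed from the text without FIG. 3, and the names `dct_coeff`, `representative_value`, and `A_POS` are hypothetical.

```python
import math

N = 8
A_POS = (7, 0)  # assumed position of A: (vertical, horizontal) frequency index


def dct_coeff(block, v, u):
    """Return the single 2-D DCT-II coefficient (v, u) of an 8x8 block."""
    cv = math.sqrt((1 if v == 0 else 2) / N)
    cu = math.sqrt((1 if u == 0 else 2) / N)
    total = 0.0
    for r in range(N):
        for c in range(N):
            total += (block[r][c]
                      * math.cos((2 * r + 1) * v * math.pi / (2 * N))
                      * math.cos((2 * c + 1) * u * math.pi / (2 * N)))
    return cv * cu * total


def representative_value(blocks, pos=A_POS):
    """Largest absolute value of the chosen coefficient over all blocks."""
    return max(abs(dct_coeff(block, *pos)) for block in blocks)


# A flat block, and a block whose lines alternate between 0 and 100.
flat = [[128] * N for _ in range(N)]
comb = [[0 if r % 2 == 0 else 100] * N for r in range(N)]

print(representative_value([flat]))        # practically zero
print(representative_value([flat, comb]))  # dominated by the line-alternating block
```

Because the maximum is taken over the blocks, a single block with strong line-to-line alternation is enough to raise the representative value of the whole macroblock.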
[0051]
Next, the mode determination means 17 performs a weighted comparison of the generated frame mode representative value x and field mode representative value y. First, for simplicity, the comparison with a weight of 1 is described. In this case, the frame mode representative value x and the field mode representative value y are compared as they are, and the mode with the smaller value is selected as the determination result. For example, if the representative value y is smaller than the representative value x, the field mode is selected.
[0052]
FIG. 4 shows a conceptual diagram of this determination operation. In FIG. 4, the frame mode determination area and the field mode determination area are separated by a boundary line represented by a function of y = x on the xy plane. The mode determining means 17 operates so as to plot the point of the representative value (x, y) in each macroblock and select the mode of the plotted area as the determination result.
[0053]
For example, the point S shown in FIG. 4 is a point where the representative value (x, y) of a certain macroblock is plotted. In this case, since it belongs to the field determination area, it is determined that the mode is the field mode.
[0054]
The point (x, y) of a macroblock that includes a moving part tends to appear close to the x-axis on the xy plane. Conversely, the point (x, y) of a macroblock that does not include a moving part tends to appear close to the y-axis. The moving and non-moving parts can therefore be distinguished. In this way, the motion of a macroblock can be detected by comparing its representative values (x, y).
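The unweighted decision with the boundary y = x can be tried end to end with the Python sketch below. It assumes a DCT-II as the orthogonal transform, position (7, 0) for the coefficient A, and frame mode on a tie; the two test macroblocks and all function names are made up for illustration.

```python
import math


def coeff_a(block):
    """Coefficient (7, 0) of the 2-D DCT-II of an 8x8 block.
    The position (7, 0) is an assumed reading of 'A' in FIG. 3."""
    scale = math.sqrt(2 / 8) * math.sqrt(1 / 8)
    return scale * sum(sum(block[r]) * math.cos((2 * r + 1) * 7 * math.pi / 16)
                       for r in range(8))


def representative_values(mb):
    """Return (x, y) for a 16x16 macroblock given as a list of 16 rows."""
    frame = [[row[c:c + 8] for row in mb[r:r + 8]] for r in (0, 8) for c in (0, 8)]
    field = [[row[c:c + 8] for row in mb[p::2]] for p in (0, 1) for c in (0, 8)]
    x = max(abs(coeff_a(b)) for b in frame)
    y = max(abs(coeff_a(b)) for b in field)
    return x, y


def decide_mode(x, y):
    """Boundary y = x: the mode with the smaller representative value wins.
    Sending ties to frame mode is a choice made for this sketch."""
    return "field" if y < x else "frame"


# Vertical edge whose column differs between the two fields (horizontal motion).
moving = [[200 if c < (4 if r % 2 == 0 else 10) else 0 for c in range(16)]
          for r in range(16)]
# Smooth vertical ramp shared by both fields (no motion).
still = [[10 * r] * 16 for r in range(16)]

print(decide_mode(*representative_values(moving)))  # field
print(decide_mode(*representative_values(still)))   # frame
```

For the moving macroblock, x is large and y is practically zero, so the point lies next to the x-axis. For the still ramp, the field blocks see twice the line-to-line step, so y exceeds x and the frame mode is chosen.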
[0055]
The mode information determined by the mode determination means 17 is provided to the selector 18. The above-described intra-frame orthogonal transform block data and intra-field orthogonal transform block data to which the mode determination is to be applied are input to the selector 18. Then, one of the orthogonal transform block data is selected based on the mode information from the mode determining means 17. That is, the mode determination unit 17 can determine the presence or absence of a macroblock motion by comparing the frame representative value x with the field representative value y.
[0056]
A third orthogonal transformer 19 orthogonally transforms the orthogonal transformation block data output from the selector 18. Then, the orthogonally transformed coefficient data is output from the output terminal 20.
[0057]
Incidentally, in the above description, only one vertical high-frequency coefficient, denoted A, of each orthogonal transform block is selected as the predetermined coefficient data.
[0058]
FIG. 5 is a diagram for explaining how to determine the predetermined coefficient data, and shows examples of intra-frame orthogonal transformation blocks. The following description refers to FIG. 5.
[0059]
First, since the coefficient A in the intra-frame orthogonal transform is coefficient data indicating a vertical high-frequency component, it takes a large value when the correlation between the fields is weakened by the movement of the object. For example, as shown in FIG. 5A, the intra-frame orthogonal transformation block then has a stripe pattern. In this case, the correlation is stronger when the fields are separated, so the coefficient A becomes smaller when the field mode orthogonal transform is performed, and the field mode can therefore be selected.
[0060]
However, the patterns contained in a block vary with the type of edge of the object, the manner and speed of its movement, and so on. For example, with a stripe pattern such as the one shown in FIG. 5B, the coefficient A is unlikely to become large even when the intra-frame orthogonal transformation is performed, so an erroneous mode determination may result. In that case, however, the coefficient next to the coefficient A is actually large. Therefore, if the coefficient next to the coefficient A is also selected as predetermined coefficient data, a more accurate mode determination can be performed.
[0061]
FIG. 6 is a diagram showing the coefficient data selected in this manner, and shows that the coefficient A and the adjacent coefficient B are selected as the predetermined coefficient data as described above. In practice, a pattern such as the one shown in FIG. 5B often occurs when a thin edge moves quickly. As described above, if a plurality of specific coefficient data are selected on the basis of patterns that are likely to occur in images, a more accurate mode determination can be performed.
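The benefit of adding a neighboring coefficient can be illustrated with a small Python experiment. The sketch assumes a DCT-II, takes A at position (7, 0), and takes B as the horizontally adjacent coefficient (7, 1); the actual position of B in FIG. 6 and the pattern of FIG. 5B are not recoverable from the text, so the test block below (a thin vertical line that sits at a different column in each field) is only a plausible stand-in.

```python
import math

N = 8
A_POS = (7, 0)  # assumed position of coefficient A
B_POS = (7, 1)  # assumed position of the adjacent coefficient B


def dct_coeff(block, v, u):
    """Return the single 2-D DCT-II coefficient (v, u) of an 8x8 block."""
    cv = math.sqrt((1 if v == 0 else 2) / N)
    cu = math.sqrt((1 if u == 0 else 2) / N)
    total = 0.0
    for r in range(N):
        for c in range(N):
            total += (block[r][c]
                      * math.cos((2 * r + 1) * v * math.pi / (2 * N))
                      * math.cos((2 * c + 1) * u * math.pi / (2 * N)))
    return cv * cu * total


# Thin vertical line: column 2 on even lines, column 5 on odd lines.
thin_line = [[100 if c == (2 if r % 2 == 0 else 5) else 0 for c in range(N)]
             for r in range(N)]

a = dct_coeff(thin_line, *A_POS)
b = dct_coeff(thin_line, *B_POS)
rep_a_only = abs(a)                # misses the motion
rep_a_and_b = max(abs(a), abs(b))  # picks it up through B

print(rep_a_only, rep_a_and_b)
```

Every line of this block has the same pixel sum, so the coefficient at (7, 0) vanishes even though the two fields disagree, while the coefficient at (7, 1) is clearly nonzero.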
[0062]
Next, the weighted comparison will be further described. In the above description, the comparison is performed in a state where the weight is 1, that is, when there is no weight. However, by performing the weighted comparison, it is possible to perform a mode determination for further improving the coding efficiency. Weighting refers to changing the above-described method of drawing a boundary line, and is the same as imparting a bias to the range of the mode determination area for frames and fields.
[0063]
FIG. 7 shows an example in which a certain boundary line is set to divide the plane into areas. In this case, the boundary is set so that the frame mode tends to be selected when the representative values (x, y) are similar. The method of assigning weights, that is, the method of setting the boundary line, can be determined arbitrarily. The boundary may be expressed by an appropriate function or by table values, and the function may be linear or non-linear.
[0064]
Determining the mode with the mode determination area weighted in this way is equivalent to, for example, substituting the representative value x into the function or table, comparing the converted value with the representative value y, and selecting the mode with the smaller value. It is also equivalent to, for example, substituting the representative value y into the function or table, comparing the converted value with the representative value x, and selecting the mode with the smaller value. It is likewise equivalent to, for example, substituting the representative values x and y into their respective functions or tables, comparing the converted values, and selecting the mode with the smaller value.
[0065]
When the amount of movement of the object is large, or when the object is stationary, the difference between the representative values tends to be large and, as described above, the points tend to appear close to one of the axes, so the mode is easy to determine accurately. However, when the object itself is complicated or noise is superimposed, the difference between the representative values becomes small unless the object is moving considerably, and it is unclear which mode should be selected.
[0066]
In this case, it is better to select the frame mode as far as possible. This is because the amount of motion is not so large to begin with, so the correlation is often stronger in the framed state than within each field. Therefore, when the difference between the representative values is not so large, a more appropriate mode determination can be performed by weighting the comparison so that the intra-frame orthogonal transform is selected.
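A weighted comparison of this kind can be sketched in Python as follows, with the boundary given either as a function or as a table. The weight 0.5 and the table entries are hypothetical values chosen only to show the frame-biased behavior; the specification leaves the boundary open.

```python
from bisect import bisect_right


def linear_boundary(x, weight=0.5):
    """Linear boundary y = weight * x (the weight is a hypothetical value)."""
    return weight * x


# Hypothetical step table: for x >= TABLE_X[i], the boundary is TABLE_Y[i].
TABLE_X = [0, 50, 200]
TABLE_Y = [0, 10, 120]


def table_boundary(x):
    """Boundary looked up from a table instead of computed from a formula."""
    return TABLE_Y[bisect_right(TABLE_X, x) - 1]


def decide_mode_weighted(x, y, boundary=linear_boundary):
    """Select field mode only if y lies below the boundary evaluated at x."""
    return "field" if y < boundary(x) else "frame"


print(decide_mode_weighted(100, 80))                  # frame (similar values)
print(decide_mode_weighted(100, 10))                  # field (clear motion)
print(decide_mode_weighted(250, 100, table_boundary)) # field
```

With the boundary y = x the point (100, 80) would fall into the field area; lowering the boundary moves such near-equal points into the frame area, which is the bias described above.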
[0067]
As described above, in the first embodiment of the present invention, the vertical high-frequency coefficient that serves as an index of motion is obtained in advance for each mode, and the orthogonal transform mode in which it is smaller is selected. In addition, by comparing the vertical high-frequency coefficients of the four orthogonal transform blocks in the macroblock and using their maximum as the representative value of the mode, a moving part within the macroblock can be detected appropriately. Furthermore, since the mode is determined by comparing the representative values obtained for each mode, the determination is less susceptible to noise. White noise superimposed on an image has no correlation between its individual noise components, so the orthogonal transform is not expected to yield a characteristic coefficient distribution in either mode. A large vertical high-frequency coefficient caused by noise is therefore likely to appear in both modes, but it is not so large as to hide the original motion characteristics of the image. Consequently, comparing the representative values relative to each other reduces mode determination errors caused by noise. The weighted comparison further reduces determination errors caused by noise in portions without motion.
[0068]
Therefore, according to Embodiment 1 of the present invention, it is possible to perform an appropriate orthogonal transformation mode determination so that the coding efficiency does not deteriorate due to the motion of an image. Further, it is possible to determine the orthogonal transformation mode that is not easily affected by the complexity of the image or the degree of superposition of noise.
[0069]
(Embodiment 2)
Hereinafter, the operation of the image coding apparatus to which the adaptive orthogonal transform mode determination method according to the second embodiment of the present invention is applied will be described. The image coding device according to the present embodiment is the same as the configuration diagram of the image coding device according to the first embodiment. However, the operation of the mode determining means 17 is different. Hereinafter, the operation of the mode determination means will be described. The description of the other components having the same function is omitted.
[0070]
The mode determining means 17 receives the frame mode representative value x output from the first representative value detecting means 15 and the field mode representative value y output from the second representative value detecting means 16. As described in the first embodiment, the mode determining means 17 performs a weighted comparison of the representative values (x, y) and selects the mode with the smaller value. In addition, it compares the frame mode representative value x with a threshold T and, when the frame mode representative value x is smaller than the threshold T, always selects the frame mode.
[0071]
FIG. 8 shows a conceptual diagram of this mode determination method. In FIG. 8, on the xy plane on which the representative values (x, y) are judged, the frame determination area and the field determination area are separated by a predetermined boundary line that defines the weighting. In addition, the entire area in which x is smaller than a certain threshold T is set as a frame determination area.
[0072]
In an image signal, particularly in a flat stationary portion, large values rarely occur in the high-frequency components whether the intra-frame orthogonal transform or the intra-field orthogonal transform is performed. Representative values that are both small are therefore compared with each other. In this case, the difference between the representative values is very small, and the mode determination result may change frequently even between blocks that contain similar images.
[0073]
In other words, when the mode determination switches between neighboring blocks, or between blocks at the same position in successive frames, the appearance of the compression distortion changes and the distortion becomes conspicuous. Further, a block whose frame mode representative value x is small is almost always a stationary portion. Therefore, performing the orthogonal transformation within the frame exploits the stronger correlation between pixels and increases coding efficiency.
[0074]
Therefore, as described above, by performing the determination using the threshold value, the frame mode is always determined when the frame mode representative value x is small, and the coding efficiency is improved.
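The threshold rule of this embodiment can be added in front of the weighted comparison, as in the Python sketch below. The threshold of 20 and the weight of 0.5 are hypothetical values. The sketch uses "equal to or less than" as in the fifth aspect, whereas the description of this embodiment says "smaller than"; the two differ only on the boundary itself.

```python
def decide_mode(x, y, threshold=20.0, weight=0.5):
    """Frame/field decision with a frame-mode floor.

    threshold and weight are illustrative values, not taken from the source.
    """
    # Small x means a (nearly) stationary block: always choose frame mode.
    if x <= threshold:
        return "frame"
    # Otherwise fall back to the weighted comparison against y = weight * x.
    return "field" if y < weight * x else "frame"


print(decide_mode(10, 1))    # frame: x is below the threshold
print(decide_mode(100, 10))  # field: clear motion
print(decide_mode(100, 80))  # frame: similar values, frame-biased boundary
```

Without the threshold, the first call would return "field" because y lies below the boundary, which is the kind of unstable decision between two small values that this embodiment is meant to avoid.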
[0075]
As described above, according to the second embodiment of the present invention, it is possible to reduce erroneous detection in a still portion of an image and perform more accurate orthogonal transform mode determination.
[0076]
(Embodiment 3)
Hereinafter, the operation of the image coding apparatus to which the adaptive orthogonal transform mode determination method according to the third embodiment of the present invention is applied will be described. The image coding device according to the present embodiment is the same as the configuration diagram of the image coding device according to the first embodiment. However, the operation contents of the first orthogonal transformer 13 and the second orthogonal transformer 14 are different. Hereinafter, the operation of each of the orthogonal transformers 13 and 14 will be described. The description of the other components having the same function is omitted.
[0077]
Since the first orthogonal transformer 13 and the second orthogonal transformer 14 perform the same orthogonal transformation as the third orthogonal transformer 19, which is actually used for encoding, they may have the same configuration as the third orthogonal transformer 19. However, of the coefficient data calculated by the first orthogonal transformer 13 and the second orthogonal transformer 14, data other than the predetermined coefficient data are not used for the mode determination.
[0078]
Therefore, as described in the first embodiment, when one or several pieces of predetermined coefficient data are selected for determination, only those coefficients are calculated. Thereby, the first orthogonal transformer 13 and the second orthogonal transformer 14 have a significantly simpler configuration than the third orthogonal transformer 19. Then, the first representative value detecting means 15 and the second representative value detecting means 16 respectively detect the maximum value using only the generated coefficient data and determine the representative value.
[0079]
Further, the first orthogonal transformer 13 and the second orthogonal transformer 14 serve to calculate and analyze in advance which mode is more advantageous when the orthogonal transformation is actually performed. The calculation results of the first orthogonal transformer 13 and the second orthogonal transformer 14 are used only for mode determination and do not become data that are actually encoded.
[0080]
Therefore, the first orthogonal transformer 13 and the second orthogonal transformer 14 are configured to operate using an orthogonal transformation equation that is simpler than the one used in the third orthogonal transformer 19, which is actually used for encoding.
[0081]
Generally, an orthogonal transform used for encoding requires extremely high-precision arithmetic. This is to prevent distortion from occurring due to an arithmetic error. If such an orthogonal transformer is realized by a circuit, the scale becomes very large. The high-precision orthogonal transformation formula is a matrix operation of multiplying a block of pixel data by a predetermined transformation coefficient represented by a cosine function and adding the product. This conversion coefficient is represented by a decimal number. Normally, an operation using a large-scale multiplier is performed in order to secure the operation accuracy.
[0082]
However, in the simplified orthogonal transformation formula used in the first orthogonal transformer 13 and the second orthogonal transformer 14, each transform coefficient is approximated by a special decimal number before use. This special decimal number is a number of the form 1/(2^n), or a number that can be calculated as a combination of such numbers. When the approximated transform coefficients are applied, the above-described multiplication can be realized by bit shifts and additions, so that a large-scale multiplier is not required. When the operation is performed by software, the amount of calculation is likewise reduced. Note that the accuracy of the approximated transform coefficients can be determined arbitrarily.
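The approximation by sums of 1/(2^n) can be illustrated in Python as follows. The sketch expands a transform coefficient greedily into shift amounts and then multiplies an integer sample using only shifts and additions. The greedy expansion, the 8-bit limit, and the function names are assumptions made for this example; the specification does not state how the approximated coefficients are chosen.

```python
import math


def dyadic_shifts(value, max_shift=8):
    """Greedy expansion of 0 <= value < 1 into shift amounts n,
    so that value is approximately the sum of 2**-n over the result."""
    shifts, residual = [], value
    for n in range(1, max_shift + 1):
        term = 2.0 ** -n
        if residual >= term:
            shifts.append(n)
            residual -= term
    return shifts


def shift_add_multiply(sample, shifts, max_shift=8):
    """Multiply an integer sample by sum(2**-n) using only shifts and adds."""
    acc = 0
    for n in shifts:
        acc += sample << (max_shift - n)  # sample * 2**(max_shift - n)
    return acc >> max_shift               # undo the common scaling


coeff = math.cos(math.pi / 4)              # 0.7071..., a typical cosine coefficient
shifts = dyadic_shifts(coeff)
print(shifts)                              # [1, 3, 4, 6, 8]
print(shift_add_multiply(100, shifts))     # 70 (exact product: 70.71...)

coarse = dyadic_shifts(coeff, max_shift=4)
print(coarse)                              # [1, 3, 4]
print(shift_add_multiply(100, coarse, 4))  # 68
```

Reducing `max_shift` trades accuracy for fewer additions, which corresponds to the freely selectable accuracy of the approximated coefficients mentioned above.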
[0083]
When such a simplified orthogonal transformation equation is used, the calculated coefficient data include a certain error. However, as described above, the distribution of the representative values (x, y) depends on the features of the image, so the mode determination is substantially equivalent to the one obtained with a high-precision arithmetic expression.
[0084]
As described above, according to the third embodiment of the present invention, the configuration of the orthogonal transformers used for mode determination can be simplified, so the accurate adaptive orthogonal transform mode determination described in the first or second embodiment can be performed with a very small circuit configuration.
[0085]
Note that, in all of the embodiments described above, the case where mode determination is performed in units of macroblocks has been described, but the same applies to the case where mode determination is performed for each orthogonal transform block. For example, when the intra-frame orthogonal transform is performed on a framed orthogonal transform block of 8 × 8 pixels, the orthogonal transform is performed on that structure as it is, whereas when the intra-field orthogonal transform is performed, the block may be field-separated into two blocks of 8 × 4 pixels, each of which is orthogonally transformed. In this case, the frame mode representative value x is the maximum of the predetermined coefficient data obtained from the one block, and the field mode representative value y is the maximum of the predetermined coefficient data obtained from the two blocks.
[0086]
In the description of all the embodiments, a specific vertical high-frequency coefficient is used. However, the present invention is not limited to this, and it is possible to set arbitrarily which coefficients are used, how many are used, and so on.
[0087]
Note that, in all of the above embodiments, the case where an image signal is input to the input terminal 10 has been described. However, for example, when motion compensation prediction is also used, a difference signal may be input. Also in this case, the present invention can be applied to the macroblock of the difference signal.
[0088]
In the above description of all the embodiments, an example of the configuration of the image encoding apparatus to which the present invention is applied has been described. However, another configuration may be used as long as the configuration has the same operation. In addition, although the description has been made regarding hardware, the present invention can be applied to software.
[0089]
In the present embodiment, an image encoding device to which the adaptive orthogonal transform mode determination method is applied has been described. However, the present invention is not limited to this, and can also be used as a motion detection device and a motion detection method.
[0090]
That is, an image signal captured by a video camera is usually an interlaced signal, and some video cameras have a function of creating a still image from the captured interlaced signal. When such a video camera creates a still image, the presence or absence of motion is detected for each block as described in the above embodiments. For a block determined to have motion, the still image is created by applying pseudo frame processing; for a block determined to have no motion, the still image is created from the block as it is. Here, pseudo frame processing means processing that generates a frame image by interpolation from a field image of the interlaced image.
[0091]
When a still image is created using a moving part of an interlaced image as a frame as it is, the still image is blurred. Therefore, by applying a pseudo frame process to a portion determined to have a motion, the blurring of a still image can be suppressed.
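The still-image creation can be sketched as follows. The sketch assumes that the pseudo frame processing keeps the even lines (one field) and rebuilds each odd line as the average of the even lines above and below it; the embodiments do not specify which field is kept or which interpolation is used. The per-block motion flags are assumed to come from the motion detection described in the embodiments, and the names `pseudo_frame` and `make_still` are hypothetical.

```python
def pseudo_frame(block):
    """Rebuild the odd lines of a block from the neighbouring even lines."""
    out = [list(row) for row in block]
    for r in range(1, len(block), 2):
        above = block[r - 1]
        # At the bottom edge there is no line below, so reuse the line above.
        below = block[r + 1] if r + 1 < len(block) else above
        out[r] = [(a + b) / 2 for a, b in zip(above, below)]
    return out

def make_still(blocks, moving_flags):
    """Apply pseudo frame processing only to blocks flagged as moving."""
    return [
        pseudo_frame(block) if moving else [list(row) for row in block]
        for block, moving in zip(blocks, moving_flags)
    ]

# Even lines hold one field, odd lines the other (displaced) field.
block = [[0, 0], [100, 100], [20, 20], [100, 100]]
still = make_still([block, block], [True, False])
print(still[0])  # interpolated block
print(still[1])  # block kept as it is
```

Blocks without motion keep the full vertical resolution of both fields, while blocks with motion trade resolution for the removal of inter-field blur.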
[0092]
As described above, the motion detection device of the present invention can be applied not only to an image coding device but also to a device that creates a still image from an interlaced image, such as a video camera.
[0093]
Note that the motion detection device and the motion detection method of the present invention are not limited to the image encoding device and the device for creating a still image from an interlaced image described above. They can also be applied to quantization and other uses in which some setting needs to be changed depending on the presence or absence of motion in the interlaced image.
[0094]
Note that the program according to the present invention is a program for causing a computer to execute the functions of all or part of the means (or devices, elements, circuits, units, or the like) of the above-described motion detection device of the present invention, and operates in cooperation with the computer.
[0095]
The program according to the present invention is also a program for causing a computer to execute the functions of all or part of the means (or devices, elements, circuits, units, or the like) of the above-described adaptive orthogonal transform mode determination device of the present invention, and operates in cooperation with the computer.
[0096]
The medium according to the present invention is a computer-readable medium carrying a program for causing a computer to execute all or part of the functions of all or part of the means of the above-described motion detection device of the present invention, the read program executing those functions in cooperation with the computer.
[0097]
The medium according to the present invention is also a computer-readable medium carrying a program for causing a computer to execute all or part of the functions of all or part of the means of the above-described adaptive orthogonal transform mode determination device of the present invention, the read program executing those functions in cooperation with the computer.
[0098]
The “part of the means (or devices, elements, circuits, units, etc.)” of the present invention and the “part of the steps (or processes, operations, actions, etc.)” of the present invention mean some of the means or steps among a plurality of means or steps, or some of the functions or operations within a single means or step.
[0099]
One usage form of the program of the present invention may be such that the program is recorded on a computer-readable recording medium and operates in cooperation with the computer.
[0100]
One use form of the program of the present invention may be a form in which the program is transmitted through a transmission medium, read by a computer, and operates in cooperation with the computer.
[0101]
Further, the data structure of the present invention includes a database, a data format, a data table, a data list, a type of data, and the like.
[0102]
The recording medium includes a ROM and the like, and the transmission medium includes the Internet, light, radio waves, sound waves, and the like.
[0103]
Further, the computer of the present invention described above is not limited to pure hardware such as a CPU, but may include firmware, an OS, and peripheral devices.
[0104]
Note that, as described above, the configuration of the present invention may be realized by software or hardware.
[0105]
[Effects of the invention]
As is apparent from the above description, the present invention can provide a motion detection device, a motion detection method, an adaptive orthogonal transform mode determination device, an adaptive orthogonal transform mode determination method, a medium, and a program capable of appropriately detecting the presence or absence of motion in an image signal.
[0106]
In addition, the present invention can provide a motion detection device, a motion detection method, an adaptive orthogonal transform mode determination device, an adaptive orthogonal transform mode determination method, a medium, and a program in which determination errors due to the complexity of the image signal or the degree of superimposed noise are less likely to occur when determining the presence or absence of motion in an image signal.
[0107]
Further, since the present invention can be realized with a configuration far simpler than the orthogonal transformer used for image encoding, it can provide a motion detection device, a motion detection method, an adaptive orthogonal transform mode determination device, an adaptive orthogonal transform mode determination method, a medium, and a program that determine the presence or absence of motion in an image signal with a small circuit scale or a small amount of computation.
[Brief description of the drawings]
FIG. 1 is a configuration diagram of an image encoding device to which an adaptive orthogonal transform mode determination method according to Embodiment 1 of the present invention is applied.
FIG. 2 is a conceptual diagram showing a method of generating an orthogonal transform block in each mode of an image encoding device to which an adaptive orthogonal transform mode determination method according to Embodiment 1 of the present invention is applied.
(A) is a diagram showing a configuration of a macro block.
(B) is a diagram showing a configuration of an intra-frame orthogonal transform block.
(C) is a diagram illustrating a configuration of an intra-field orthogonal transform block.
FIG. 3 is a conceptual diagram showing an example of predetermined coefficient data of an image encoding device to which the adaptive orthogonal transform mode determination method according to Embodiment 1 of the present invention is applied.
FIG. 4 is a conceptual diagram illustrating mode determination in the case of a weight of 1 of the image encoding device to which the adaptive orthogonal transform mode determination method according to Embodiment 1 of the present invention is applied.
FIG. 5 is a conceptual diagram showing a specific image pattern of the image encoding device to which the adaptive orthogonal transform mode determination method according to Embodiment 1 of the present invention is applied.
(A) is a diagram showing a first specific image pattern.
(B) is a diagram showing a second specific image pattern.
FIG. 6 is a conceptual diagram showing a second example of predetermined coefficient data of the image encoding device to which the adaptive orthogonal transform mode determination method according to Embodiment 1 of the present invention is applied.
FIG. 7 is a conceptual diagram illustrating mode determination by weighting comparison of an image encoding device to which the adaptive orthogonal transform mode determination method according to Embodiment 1 of the present invention is applied.
FIG. 8 is a conceptual diagram illustrating an example of a boundary of an image encoding device to which an adaptive orthogonal transform mode determination method according to Embodiment 2 of the present invention is applied.
[Explanation of symbols]
10 Input terminal
11 Intra-frame orthogonal transformation block generation means
12 Intra-field orthogonal transformation block generation means
13 First orthogonal transformer
14 Second orthogonal transformer
15 First representative value detecting means
16 Second representative value detecting means
17 Mode determination means
18 Selector
19 Third orthogonal transformer
20 Output terminal

Claims

A motion detection device that detects motion in each of a plurality of pixel blocks obtained by dividing an interlaced image signal, comprising:
Intra-frame orthogonal transform block generating means for dividing the pixel block into n (n ≧ 1) intra-frame orthogonal transform blocks;
First orthogonal transform means for performing an orthogonal transform for motion detection on each of the divided intra-frame orthogonal transform blocks;
First representative value detecting means for extracting m (m ≧ 1) pieces of predetermined coefficient data from each of the orthogonally transformed blocks, selecting one of the extracted coefficient data, and setting it as a frame representative value x;
Intra-field orthogonal transform block generating means for dividing the pixel block into k (k ≧ 1) predetermined intra-field orthogonal transform blocks;
Second orthogonal transform means for performing an orthogonal transform for motion detection on each of the divided intra-field orthogonal transform blocks;
Second representative value detecting means for extracting s (s ≧ 1) pieces of predetermined coefficient data from each of the orthogonally transformed blocks, selecting one of the extracted coefficient data, and setting it as a field representative value y; and
Motion determining means for comparing the frame representative value x with the field representative value y to determine the presence or absence of motion in the pixel block.

The motion detection device according to claim 1, wherein the predetermined coefficient data is a coefficient indicating a specific vertical high-frequency component.

The motion detection device according to claim 1, wherein the frame representative value x and the field representative value y are obtained by selecting a maximum value among absolute values of the extracted predetermined coefficient data in each orthogonal transform.

The motion detection device according to claim 3, wherein the presence or absence of motion is determined by (1) directly comparing the frame representative value x and the field representative value y, or (2) comparing values obtained by converting one or both of the representative values with a predetermined function or table value, and wherein it is determined that there is no motion when the frame representative value x is smaller, and that there is motion when the field representative value y is smaller.

The motion detection device according to claim 4, wherein, in determining the presence or absence of motion, it is determined that there is no motion when the frame representative value x is equal to or less than a predetermined threshold value, even if the frame representative value x is larger than the field representative value y.

The motion detection device according to claim 1, wherein each of the orthogonal transforms calculates only the predetermined coefficient data to be extracted.

An adaptive orthogonal transform mode determination device that, when an interlaced image signal is divided into a plurality of pixel blocks and encoding is performed for each of the pixel blocks, performs control to switch between a frame orthogonal transform for image encoding and a field orthogonal transform for image encoding for each pixel block, comprising:
Motion determining means for determining the presence or absence of motion in the pixel block; and
Control means for (1) performing control so that the field orthogonal transform for image encoding is applied to the pixel block when it is determined that the pixel block has motion, and (2) performing control so that the frame orthogonal transform for image encoding is applied to the pixel block when it is determined that the pixel block has no motion,
Wherein the motion determining means uses the motion detection device according to claim 1.

A motion detection method for detecting motion in each of a plurality of pixel blocks obtained by dividing an interlaced image signal, comprising:
An intra-frame orthogonal transform block generating step of dividing the pixel block into n (n ≧ 1) intra-frame orthogonal transform blocks;
A first orthogonal transform step of performing an orthogonal transform for motion detection on each of the divided intra-frame orthogonal transform blocks;
A first representative value detecting step of extracting m (m ≧ 1) pieces of predetermined coefficient data from each of the orthogonally transformed blocks, selecting one of the extracted coefficient data, and setting it as a frame representative value x;
An intra-field orthogonal transform block generating step of dividing the pixel block into k (k ≧ 1) predetermined intra-field orthogonal transform blocks;
A second orthogonal transform step of performing an orthogonal transform for motion detection on each of the divided intra-field orthogonal transform blocks;
A second representative value detecting step of extracting s (s ≧ 1) pieces of predetermined coefficient data from each of the orthogonally transformed blocks, selecting one of the extracted coefficient data, and setting it as a field representative value y; and
A mode determination step of determining the presence or absence of motion in the pixel block by comparing the frame representative value x with the field representative value y.

An adaptive orthogonal transform mode determination method that, when an interlaced image signal is divided into a plurality of pixel blocks and encoding is performed for each of the pixel blocks, performs control to switch between a frame orthogonal transform for image encoding and a field orthogonal transform for image encoding for each pixel block, comprising:
A motion determining step of determining the presence or absence of motion in the pixel block; and
A control step of (1) performing control so that the field orthogonal transform for image encoding is applied to the pixel block when it is determined that the pixel block has motion, and (2) performing control so that the frame orthogonal transform for image encoding is applied to the pixel block when it is determined that the pixel block has no motion,
Wherein the motion determining step uses the motion detection method according to claim 8.

A program for causing a computer to function as the following means of the motion detection device according to claim 1:
The intra-frame orthogonal transform block generating means for dividing the pixel block into n (n ≧ 1) intra-frame orthogonal transform blocks;
The first orthogonal transform means for performing an orthogonal transform for motion detection on each of the divided intra-frame orthogonal transform blocks;
The first representative value detecting means for extracting m (m ≧ 1) pieces of predetermined coefficient data from each of the orthogonally transformed blocks, selecting one of the extracted coefficient data, and setting it as a frame representative value x;
The intra-field orthogonal transform block generating means for dividing the pixel block into k (k ≧ 1) predetermined intra-field orthogonal transform blocks;
The second orthogonal transform means for performing an orthogonal transform for motion detection on each of the divided intra-field orthogonal transform blocks;
The second representative value detecting means for extracting s (s ≧ 1) pieces of predetermined coefficient data from each of the orthogonally transformed blocks, selecting one of the extracted coefficient data, and setting it as a field representative value y; and
The mode determination means for determining the presence or absence of motion in the pixel block by comparing the frame representative value x with the field representative value y.

A medium carrying the program according to claim 10, wherein the medium can be processed by a computer.