JP2004213202A

JP2004213202A - Image processing method, image processor, image pickup unit, program and recording medium

Info

Publication number: JP2004213202A
Application number: JP2002380033A
Authority: JP
Inventors: Keiichi Ikebe; 慶一池辺; Hiroyuki Sakuyama; 宏幸作山; Taku Kodama; 卓児玉; Takao Inoue; 隆夫井上; Shin Aoki; 伸青木; Ikuko Kusatsu; 郁子草津; Akira Takahashi; 彰高橋; Takashi Maki; 隆史牧; Takanori Yano; 隆則矢野; Takeshi Koyama; 毅小山
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 2002-12-27
Filing date: 2002-12-27
Publication date: 2004-07-29
Anticipated expiration: 2022-12-27
Also published as: JP3897253B2

Abstract

PROBLEM TO BE SOLVED: To detect a moving object, and its direction of movement, in a still image captured by a digital camera or the like, without referring to any other image.

SOLUTION: A wavelet transform processing unit 102 applies a two-dimensional wavelet transform, tile by tile, to a still image fetched from an image source 50. A Yh, Yv calculation unit 105 calculates the horizontal and vertical high-frequency component amounts Yh and Yv, per tile or per region smaller than a tile, from the HL and LH subband coefficients generated by the two-dimensional wavelet transform. A moving object detection unit 110 then judges, from the values of Yh and Yv, whether each tile or small region contains a moving object and in which direction it moves, thereby detecting moving objects in the still image. The HL and LH subband coefficients generated during JPEG 2000 compression or decompression may also be used for the detection.

COPYRIGHT: (C)2004, JPO&NCIPI

Description

【０００１】
【発明の属する技術分野】
本発明は、画像処理に係り、特に、デジタルカメラなどによって撮影された静止画像中の移動物体の検出及びそれに関連した画像処理に関する。
【０００２】
【従来の技術】
様々な画像処理の分野において、画像中の移動物体の検出が必要となることがある。動画像の場合には、時間的に連続した前後の画像の比較により、比較的容易に移動物体を検出することができる（例えば特許文献１参照）。しかし、静止画像の場合、時間的に接近した画像を用意できないときには、そのような方法を適用できない。
【０００３】
画像は記録又は伝送に先立って圧縮されることが多い。静止画像の圧縮にはＪＰＥＧが広く利用されているが、これに代わる圧縮方式としてＪＰＥＧ２０００（ＩＳＯ／ＩＥＣ ＦＣＤ １５４４４−１）が注目されている（例えば非特許文献１参照）。
【０００４】
【特許文献１】
特開平１０−１４５７７６号公報
【非特許文献１】
野水泰之著、「次世代画像符号化方式ＪＰＥＧ２０００」、
株式会社トリケップス、２００１年２月１３日
【０００５】
【発明が解決しようとする課題】
よって、本発明の目的は、デジタルカメラなどによって撮影された静止画像中の移動物体を検出するための新規な方法及び装置を提供することにある。本発明のもう１つの目的は、静止画像中の移動物体を検出し、その検出結果に応じて静止画像を圧縮する新規な方法及び装置を提供することにある。本発明のもう１つの目的は、圧縮された静止画像中の移動物体を検出する新規な方法及び装置を提供することにある。本発明のもう１つの目的は、静止画像を圧縮処理した符号化データを移動物体の検出結果に応じて変換する新規な方法及び装置を提供することにある。
【０００６】
【課題を解決するための手段】
本発明の画像処理方法は、請求項１に記載されるように、静止画像に２次元ウェーブレット変換が適用されることにより得られたＨＬサブバンド係数及びＬＨサブバンド係数から、静止画像の領域毎の水平方向及び垂直方向の高周波成分量を計算する処理と、前記高周波成分量に基づいて静止画像中の移動物体を検出する処理とを含むことを特徴とする。
【０００７】
本発明の画像処理方法のもう１つの特徴は、請求項２に記載されるように、請求項１に記載の構成に加え、２次元ウェーブレット変換を含む静止画像の圧縮処理をさらに含み、前記圧縮処理における２次元ウェーブレット変換により生成されるＨＬサブバンド係数及びＬＨサブバンド係数が高周波成分量の計算に用いられることにある。
【０００８】
本発明の画像処理方法のもう１つの特徴は、請求項３に記載されるように、請求項２に記載の構成において、前記圧縮処理は、移動物体の検出結果に従って、移動物体の領域が他の領域より高画質の符号化データを生成することにある。
【０００９】
本発明の画像処理方法のもう１つの特徴は、請求項４に記載されるように、請求項２に記載の構成に加え、前記圧縮処理により生成された符号化データを移動物体の検出結果に従って加工することにより、前記符号化データを移動物体の領域が他の領域より高画質の符号データに変換する処理をさらに含むことにある。
【００１０】
本発明の画像処理方法のもう１つの特徴は、請求項５に記載されるように、請求項１に記載の構成に加え、２次元ウェーブレット変換を含む圧縮処理により圧縮された静止画像の符号化データの伸長処理をさらに含み、前記伸長処理により生成されるＨＬサブバンド係数及びＬＨサブバンド係数が高周波成分量の計算に用いられることにある。
【００１１】
本発明の画像処理方法のもう１つの特徴は、請求項６に記載されるように、請求項５に記載の構成に加え、移動物体の検出結果に従って前記符号化データを加工し、前記符号化データを移動物体の領域が他の領域より高画質の符号データに変換する処理をさらに含むことにある。
【００１２】
本発明の画像処理方法のもう１つの特徴は、請求項７に記載されるように、請求項１乃至６のいずれか１項に記載の構成において、２次元ウェーブレット変換が適用される領域と同一の領域毎の高周波成分量を計算することにある。
【００１３】
本発明の画像処理方法のもう１つの特徴は、請求項８に記載されるように、請求項１乃至６のいずれか１項に記載の構成において、２次元ウェーブレット変換が適用される領域より小さな領域毎の高周波成分量を計算することにある。
【００１４】
本発明の画像処理方法のもう１つの特徴は、請求項９に記載されるように、請求項７又は８に記載の構成において、移動物体検出の処理で、水平方向と垂直方向の高周波成分量のうちの小さい方の高周波成分量が所定の閾値以下の領域を移動物体が含まれる領域と判断することにある。
【００１５】
本発明の画像処理方法のもう１つの特徴は、請求項１０に記載されるように、請求項９に記載の構成において、移動物体検出の処理で、水平方向と垂直方向の高周波成分量の大小関係に基づいて移動物体の移動方向を判断することにある。本発明の画像処理装置は、請求項１１に記載されるように、静止画像に対する２次元ウェーブレット変換を実行するウェーブレット変換処理手段と、前記２次元ウェーブレット変換により生成されたＨＬサブバンド係数及びＬＨサブバンド係数から、静止画像の領域毎の水平方向及び垂直方向の高周波成分量を計算する高周波成分量算出手段と、前記高周波成分量に基づいて静止画像中の移動物体を検出する移動物体検出手段とを有することにある。
【００１６】
本発明の画像処理装置のもう１つの特徴は、請求項１２に記載されるように、請求項１１に記載の構成に加え、静止画像の圧縮処理を実行する圧縮処理手段を有し、前記ウェーブレット変換処理手段は前記圧縮処理手段に含まれることにある。
【００１７】
本発明の画像処理装置のもう１つの特徴は、請求項１３に記載されるように、請求項１２に記載の構成において、前記圧縮処理手段は、前記移動物体検出手段による移動物体の検出結果に従って、移動物体の領域が他の領域より高画質の符号化データを生成することにある。
【００１８】
本発明の画像処理装置のもう１つの特徴は、請求項１４に記載されるように、請求項１２に記載の構成に加え、前記圧縮処理により生成された符号化データを前記移動物体検出手段による移動物体の検出結果に従って加工し、前記符号化データを移動物体の領域が他の領域より高画質の符号データに変換する符号加工手段を有することにある。
【００１９】
本発明の画像処理装置は、請求項１５に記載されるように、２次元ウェーブレット変換を含む圧縮処理により圧縮された静止画像の符号化データに対する伸長処理を実行する伸長処理手段と、前記伸長処理により生成されるＨＬサブバンド係数及びＬＨサブバンド係数から、静止画像の領域毎の水平方向及び垂直方向の高周波成分量を計算する高周波成分量算出手段と、前記高周波成分量に基づいて静止画像中の移動物体を検出する移動物体検出手段を有することにある。
【００２０】
本発明の画像処理装置のもう１つの特徴は、請求項１６に記載されるように、請求項１５に記載の構成に加え、前記移動物体検出手段による移動物体の検出結果に従って前記符号化データを加工し、前記符号化データを移動物体の領域が他の領域より高画質の符号データに変換する符号加工手段を有することにある。
【００２１】
本発明の画像処理装置のもう１つの特徴は、請求項１７に記載されるように、請求項１１乃至１６のいずれか１項に記載の構成において、前記高周波成分量算出手段は２次元ウェーブレット変換が適用される領域と同一の領域毎の高周波成分量を計算することにある。
【００２２】
本発明の画像処理装置のもう１つの特徴は、請求項１８に記載されるように、請求項１１乃至１６のいずれか１項に記載の構成において、前記高周波成分量検出手段は２次元ウェーブレット変換が適用される領域より小さな領域毎の高周波成分量を計算することにある。
【００２３】
本発明の撮像装置は、請求項１９に記載されるように、請求項１１乃至１６のいずれか１項に記載の各手段と、静止画像を撮影するための撮像手段を有することを特徴とする。
【００２４】
【発明の実施の形態】
本発明の実施の形態の説明に先立ち、その理解に必要な範囲でＪＰＥＧ２０００について概説する。
【００２５】
図１は、ＪＰＥＧ２０００の基本的な圧縮・伸長処理の流れを示すブロック図である。圧縮処理の対象となる画像データは、各コンポーネント毎に、重複しない矩形領域（タイル）に分割され、各コンポーネント毎にタイル単位で処理される。ただし、画像全体を１つのタイルとして処理することも可能である。
【００２６】
各コンポーネントの各タイル画像は、色空間変換・逆変換部１で、圧縮率の向上を目的として、ＲＧＢデータやＣＭＹデータからＹＣｒＣｂデータへの色空間変換を施される。この色空間変換が省かれる場合もある。
【００２７】
色空間変換後のタイル画像は、ウェーブレット変換／逆変換部２により、２次元ウェーブレット変換（離散ウェーブレット変換）を施され、複数のサブバンドに分解される。ウェーブレット係数はサブバンド毎に量子化／逆量子化部３によって量子化される。ＪＰＥＧ２０００は可逆圧縮（ロスレス圧縮）と非可逆圧縮（ロシィ圧縮）のいずれも可能であり、可逆圧縮の場合には量子化ステップ幅は常に１であり、この段階では実質的に量子化されない。
【００２８】
量子化後の各サブバンド係数は、エントロピー符号化／復号化部４でエントロピー符号化される。このエントロピー符号化には、ブロック分割、係数モデリング及び２値算術符号化からなるＥＢＣＯＴ（Embedded Block Coding with Optimized Truncation）と呼ばれるブロックベースのビットプレーン符号化方式が用いられる。量子化後の各サブバンド係数のビットプレーンが、上位ビットから下位ビットへ向かって、コードブロックと呼ばれるブロック毎に符号化される。
【００２９】
タグ処理部５において、エントロピー符号化／復号化部４で生成されたコードブロックの符号がまとめられパケットが作成され、次に、パケットがプログレッション順序に従って並べられるとともに必要なタグ情報が付加されることにより、所定のフォーマットの符号化データが作成される。ＪＰＥＧ２０００では、符号順序制御に関して、解像度レベル、位置（プリシンクト)、レイヤ、コンポーネント（色成分)の組み合わせによる５種類のプログレッション順序が定義されている。
【００３０】
このようにして生成されるＪＰＥＧ２０００の符号化データのフォーマットを図２に示す。図２に見られるように、符号化データはその始まりを示すＳＯＣマーカと呼ばれるタグで始まり、その後に符号化パラメータや量子化パラメータ等を記述したメインヘッダ(Main Header)と呼ばれるタグ情報が続き、その後に各タイル毎の符号データが続く。各タイル毎の符号データは、ＳＯＴマーカと呼ばれるタグで始まり、タイルヘッダ(Tile Header)と呼ばれるタグ情報、ＳＯＤマーカと呼ばれるタグ、各タイルの符号列を内容とするタイルデータ（Tile Data）で構成される。最後のタイルデータの後に、終了を示すＥＯＣマーカと呼ばれるタグが置かれる。
【００３１】
伸長処理は圧縮処理と逆の処理となる。符号化データはタグ処理部５で各コンポーネントの各タイルの符号列に分解される。この符号列はエントロピー符号化／復号化部４によってエントロピー復号化される。復号化されたウェーブレット係数は量子化／逆量子化部３で逆量子化されたのち、ウェーブレット変換／逆変換部２で２次元の逆ウェーブレット変換を施されることにより、各コンポーネントの各タイルの画像が再生される。各コンポーネントの各タイル画像は色空間変換／逆変換部１で逆色変換処理を施されてＲＧＢなどのコンポーネントから構成されるタイル画像に戻される。
【００３２】
ここで、本発明に直接関連する２次元ウェーブレット変換についてさらに説明する。図３乃至図６は、モノクロ画像（又はカラー画像の１つのコンポーネント）の16×16画素のタイル画像に対して、ＪＰＥＧ２０００で採用されている５×３変換と呼ばれるウェーブレット変換を垂直方向及び水平方向に施す過程を説明するための図である。
【００３３】
図３は変換前のタイル画像である。図示のようにＸＹ座標をとり、あるｘについて、Ｙ座標がｙである画素の画素値をP（y）（0≦ｙ≦15）と表す。ＪＰＥＧ２０００では、まず垂直方向（Y座標方向）に、Ｙ座標が奇数（y=2i+1）の画素を中心にハイパスフィルタを施して係数C(2i+1)を得る。次に、Ｙ座標が偶数（y=2i）の画素を中心にローパスフィルタを施して係数C(2i)を得る(これを全てのｘについて行う)。ここで、ハイパスフィルタとローパスフィルタはそれぞれ式（１）と式（２）で表される。式中のfloor（ｘ）は、ｘのフロア関数（実数ｘを、ｘを越えず、かつｘに最も近い整数に置換する関数）である。
【００３４】
C(2i+1)=P(2i+1)−floor( (P(2i)+P(2i+2))/2 ) 式(1)
C(2i)=P(2i)+floor( (C(2i-1)+C(2i+1)+2)/4 ) 式(2)
【００３５】
なお、画像の端部においては、中心となる画素に対して隣接画素群が存在しないことがあり、この場合は「ミラリング」と呼ばれる手法によって不足する画素値を補うことになる。ミラリングは、文字通り境界を中心として画素値を線対称に折り返し、折り返した値を隣接画素群の値とみなす操作である。
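Equations (1) and (2), together with the mirroring rule for the borders, can be sketched in a few lines of Python. This is a minimal illustration rather than the patent's implementation: the function names `mirror` and `lift53_forward` are hypothetical, the input is assumed to be a list of at least two integers, and Python's `//` operator stands in for floor().

```python
def mirror(i, n):
    """Reflect an out-of-range index back into [0, n-1] around the border sample."""
    if i < 0:
        return -i
    if i >= n:
        return 2 * (n - 1) - i
    return i


def lift53_forward(p):
    """Apply equations (1) and (2) to one column (or row) of integer samples.

    Returns the interleaved result: even positions hold low-pass (L)
    coefficients, odd positions hold high-pass (H) coefficients.
    Assumes len(p) >= 2.
    """
    n = len(p)
    c = list(p)
    # Equation (1): high-pass coefficients centered on odd positions.
    for k in range(1, n, 2):
        c[k] = p[k] - (p[mirror(k - 1, n)] + p[mirror(k + 1, n)]) // 2
    # Equation (2): low-pass coefficients centered on even positions,
    # using the high-pass values computed above.
    for k in range(0, n, 2):
        c[k] = p[k] + (c[mirror(k - 1, n)] + c[mirror(k + 1, n)] + 2) // 4
    return c
```

A constant input yields zero at every odd (high-pass) position, which is the behaviour the later high-frequency measures rely on: regions without detail contribute little to the H coefficients.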
【００３６】
ハイパスフィルタで得られる係数をＨ、ローパスフィルタで得られる係数をＬ、とそれぞれ表記すれば、垂直方向の変換によって図３の画像は図４のようなＬ係数、Ｈ係数の配列へと変換される。
【００３７】
続いて、図４の係数配列に対して、水平方向に、Ｘ座標が奇数（ｘ=2i+1）の係数を中心にハイパスフィルタを施し、次にＸ座標が偶数（ｘ=2i）の係数を中心にローパスフィルタを施す(これを全てのｙについて行う。この場合、前記の式のP(2i)等は係数値を表すものと読み替える)。
【００３８】
Ｌ係数を中心にローパスフィルタを施して得られる係数をＬＬ、Ｌ係数を中心にハイパスフィルタを施して得られる係数をＨＬ、Ｈ係数を中心にローパスフィルタを施して得られる係数をＬＨ、Ｈ係数を中心にハイパスフィルタを施して得られる係数をＨＨ、とそれぞれ表記すれば、図４の係数配列は図５の様な係数配列へと変換される。ここで同一の記号を付した係数群はサブバンドと呼ばれ、図５は４つのサブバンドで構成される。
【００３９】
以上の処理で、１回のウェーブレット変換（１回のデコンポジション（分解））が終了する。図６は、ウェーブレット係数をサブバンド毎に集めたもので、このように係数を配列することをデインターリーブと呼び、図５のような状態に配置することをインターリーブと呼ぶ。
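The single decomposition level described above (vertical pass, horizontal pass, then deinterleaving into LL, HL, LH and HH) might be sketched as follows. The names `lift53_forward` and `dwt53_2d` and the dictionary return format are illustrative assumptions. The tile is assumed to be a list of equal-length rows of integers with at least two rows and two columns.

```python
def lift53_forward(p):
    """1-D 5/3 lifting step per equations (1) and (2), with mirrored borders."""
    n = len(p)
    m = lambda i: -i if i < 0 else (2 * (n - 1) - i if i >= n else i)
    c = list(p)
    for k in range(1, n, 2):
        c[k] = p[k] - (p[m(k - 1)] + p[m(k + 1)]) // 2
    for k in range(0, n, 2):
        c[k] = p[k] + (c[m(k - 1)] + c[m(k + 1)] + 2) // 4
    return c


def dwt53_2d(tile):
    """One decomposition level; returns the deinterleaved subbands as a dict."""
    h, w = len(tile), len(tile[0])
    # Vertical pass: filter every column (Fig. 3 -> Fig. 4).
    cols = [lift53_forward([tile[y][x] for y in range(h)]) for x in range(w)]
    vert = [[cols[x][y] for x in range(w)] for y in range(h)]
    # Horizontal pass: filter every row (Fig. 4 -> Fig. 5, interleaved).
    inter = [lift53_forward(row) for row in vert]
    # Deinterleave (Fig. 6): even index = low-pass, odd index = high-pass.
    pick = lambda ys, xs: [[inter[y][x] for x in xs] for y in ys]
    even_y, odd_y = range(0, h, 2), range(1, h, 2)
    even_x, odd_x = range(0, w, 2), range(1, w, 2)
    return {
        "LL": pick(even_y, even_x),
        "HL": pick(even_y, odd_x),  # high-pass applied to L rows
        "LH": pick(odd_y, even_x),  # low-pass applied to H rows
        "HH": pick(odd_y, odd_x),
    }
```

For a tile containing only a vertical edge, all LH coefficients are zero and the HL subband carries the energy, which matches the later statement that HL represents horizontal high-frequency content.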
【００４０】
２回目のウェーブレット変換は、ＬＬサブバンドを原画像と見なして、同様の処理により行われる。その処理結果をデインターリーブすると、図７に示すようなサブバンドの係数が得られる。なお、図６及び図７中の係数の接頭の１や２は、その係数が得られるまでのウェーブレット変換の回数（デコンポジション・レベル）を示す。
【００４１】
図８に、デコンポジション・レベル数＝３の場合のサブバンド分解の様子を示す。なお、図８（ｄ）に示す各サブバンド中の括弧で囲んだ数字は解像度レベルを表す。
【００４２】
ここで、ＪＰＥＧ２０００におけるプリシンクト、コードブロック、パケット、レイヤについて説明する。画像≧タイル≧サブバンド≧プリシンクト≧コードブロックの大きさ関係がある。
【００４３】
プリシンクトとは、サブバンドの矩形領域で、同じデコンポジションレベルのＨＬ，ＬＨ，ＨＨサブバンドの空間的に同じ位置にある３つの領域の組が１つのプリシンクトとして扱われる。ただし、ＬＬサブバンドでは、１つの領域が１つのプリシンクトとして扱われる。プリシンクトのサイズをサブバンドと同じサイズにすることも可能である。また、プリシンクトを分割した矩形領域がコードブロックである。図９にデコンポジションレベル１における１つのプリシンクトとコードブロックを例示した。図中のプリシンクトと記された空間的に同じ位置にある３つの領域の組が１つのプリシンクトとして扱われる。
【００４４】
プリシンクトに含まれる全てのコードブロックの符号の一部（例えば最上位から３ビット目までの３枚のビットプレーンの符号）を取り出して集めたものがパケットである。符号が空（から）のパケットも許される。コードブロックの符号をまとめてパケットを生成し、所望のプログレッション順序に従ってパケットを並べることにより符号化データを形成する。図２の各タイルに関するＳＯＤ以下の部分がパケットの集合である。
【００４５】
全てのプリシンクト（つまり、全てのコードブロック、全てのサブバンド）のパケットを集めると、画像全域の符号の一部（例えば、画像全域のウェーブレット係数の最上位のビットプレーンから３枚目までのビットプレーンの符号）ができるが、これがレイヤである。したがって、伸長時に復号されるレイヤ数が多いほど再生画像の画質は向上する。つまり、レイヤは画質の単位と言える。全てのレイヤを集めると、画像全域の全てのビットプレーンの符号になる。
【００４６】
さて、デジタルカメラなどに用いられるイメージャは、ある露光時間内の光量を電気信号に変換することによって撮像する。したがって、露光時間内での移動量を無視できないような速度で移動する物体が含まれる静止画像の場合、移動物体の領域では移動方向の高周波成分が減少する。
【００４７】
２次元ウェーブレット変換の説明から理解されるように、ＨＬサブバンド係数は画像の垂直エッジ成分すなわち水平方向の高周波成分であり、ＬＨサブバンド係数は画像の水平エッジ成分すなわち垂直方向の高周波成分である。したがって、ＨＬサブバンド係数より水平方向の高周波成分量（の尺度）Ｙｈを、ＬＨサブバンド係数より垂直方向の高周波成分量（の尺度）Ｙｖをそれぞれ計算することができる。デジタルカメラなどで撮影された静止画像をタイル分割し、各タイル毎に２次元ウェーブレット変換した場合、移動物体が含まれないタイルにおいては一般に図１０の（ｄ）に示すようにＹｈとＹｖの差は小さい。これに対し、水平方向に移動する移動物体が含まれるタイルにおいては、図１０の（ａ）に示すようにＹｈがＹｖに比べかなり減少する。垂直方向に移動する移動物体が含まれるタイルにおいては、図１０の（ｂ）に示すようにＹｈに比べＹｖがかなり減少する。斜め方向に移動する移動物体が含まれるタイルでは、図１０の（ｃ）に示すように、Ｙｈ，Ｙｖともに減少するが、その違いは小さい。本発明は、このような移動物体の移動方向と水平方向及び垂直方向の高周波成分量の変化の関係を利用し、静止画像中の移動物体及びその移動方向を検出する。
【００４８】
以下、本発明のいくつかの実施の形態について図１１乃至図１９を参照して説明する。なお、説明の重複を減らすため、複数の図面において同一部分もしくは対応部分に同一の参照番号を用いる。
【００４９】
《実施の形態１》図１１は、本発明の１つの実施の形態を説明するためのブロック図である。
【００５０】
図１１において、画像ソース５０は、処理対象の静止画像のデータを蓄積している記憶装置やパソコンなどの機器である。この画像ソース５０は、当該画像処理装置の内部にあっても、外部にあってネットワークなどの伝送路を介して当該画像処理装置に接続されるものであってもよい。
【００５１】
ウェーブレット変換処理部１０２は、画像ソース５０より静止画像のデータを取り込み、静止画像に対し重複しない矩形領域（以下、タイル）毎に２次元のウェーブレット変換（例えば、前述の５×３変換）を施す処理を実行する手段である。
【００５２】
Ｙｈ，Ｙｖ算出部１０５は、ウェーブレット変換処理部１０２より出力されるＨＬ，ＬＨサブバンドの係数を取り込み、それを用いて水平方向と垂直方向の高周波成分量Ｙｈ，Ｙｖを計算する処理を実行する手段である。
【００５３】
移動物体検出部１１０は、水平方向、垂直方向の高周波成分量Ｙｈ，Ｙｖに基づいて静止画像中の移動物体を検出する処理を実行する手段である。移動物体検出の単位領域として、タイルと、それより小さい領域を選ぶことができる。まず、単位領域としてタイルが選ばれた場合の動作について、図１２に示すフローチャートを参照して説明する。
【００５４】
１つのタイルが選ばれ（ステップＳ１００）、ウェーブレット変換処理部１０２で同タイルの２次元ウェーブレット変換が行われる（ステップＳ１０１）。
【００５５】
Ｙｈ，Ｙｖ算出部１０５で、そのタイルのＨＬ，ＬＨサブバンド係数を用いて、例えば、（３）式により水平方向の高周波成分量Ｙｈを、（４）式により垂直方向の高周波成分量Ｙｖをそれぞれ計算する（ステップＳ１０２）。
Ｙh＝ah・Σ|1HL|+bh・Σ|2HL|+ch・Σ|3HL| (3)式
Ｙv＝av・Σ|1LH|+bv・Σ|2LH|+cv・Σ|3LH| (4)式
ただし、ah,bh,ch,av,bv,cvは０以上の定数である。
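Equations (3) and (4) are weighted sums of absolute coefficient values over the HL and LH subbands of each decomposition level. A minimal sketch follows. The function names and the default weights of 1.0 are assumptions made for illustration, since the text only requires the constants to be non-negative.

```python
def abs_sum(subband):
    """Sum of absolute values over a 2-D list of coefficients."""
    return sum(abs(c) for row in subband for c in row)


def high_freq_amounts(hl_levels, lh_levels,
                      h_weights=(1.0, 1.0, 1.0),
                      v_weights=(1.0, 1.0, 1.0)):
    """Return (Yh, Yv) per equations (3) and (4).

    hl_levels / lh_levels: subbands ordered by decomposition level
    (1HL, 2HL, 3HL and 1LH, 2LH, 3LH).
    h_weights / v_weights: the constants (ah, bh, ch) and (av, bv, cv).
    Passing fewer levels simply shortens the sums.
    """
    yh = sum(w * abs_sum(sb) for w, sb in zip(h_weights, hl_levels))
    yv = sum(w * abs_sum(sb) for w, sb in zip(v_weights, lh_levels))
    return yh, yv
```

Because `zip` stops at the shorter input, passing only the level-1 subbands gives the reduced variant that the text mentions as an alternative.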
【００５６】
この例では、デコンポジション・レベル１，２，３のＨＬ，ＬＨサブバンド係数を用いるが、より少ないレベルのＨＬ，ＬＨサブバンド係数（例えば、１ＨＬ，１ＬＨサブバンド係数のみ）、あるいは、より多くのレベルのＨＬ，ＬＨサブバンド係数を用いて計算をしてもよい。
【００５７】
移動物体検出部１１０において、Ｙｈ，Ｙｖの小さいほうの値すなわちＭｉｎ（Ｙｈ，Ｙｖ）と閾値ＴＨ１の比較判定を行う（ステップＳ１０４）。
【００５８】
Ｍｉｎ（Ｙｈ，Ｙｖ）が閾値ＴＨ１より大きければ（ステップＳ１０４，Ｙｅｓ）、図１０の（ｄ）のケースに相当するので、注目しているタイルに移動物体が含まれていないと判断し、処理はステップＳ１１２に進む。
【００５９】
移動物体検出部１１０は、Ｍｉｎ（Ｙｈ，Ｙｖ）が閾値ＴＨ１以下ならば（ステップＳ１０４，Ｎｏ）、図１０の（ａ），（ｂ）又は（ｃ）のケースに相当するので、注目しているタイルに移動物体が含まれていると判断する。そして、移動物体の移動方向を判別するため、Ｙｈ，Ｙｖの大きいほうの値と、小さいほうの値の比Ｍａｘ（Ｙｈ，Ｙｖ）／Ｍｉｎ（Ｙｈ，Ｙｖ）を計算し、その比と閾値ＴＨ２との比較判定を行う（ステップＳ１０６）。比がＴＨ２より大きいならば、図１０の（ａ）又は（ｂ）のケースに相当するので、Ｍｉｎ（Ｙｈ，Ｙｖ）がＹｈならば水平方向に移動する物体が含まれていると判断し、Ｍｉｎ（Ｙｈ，Ｙｖ）がＹｖならば垂直方向に移動する物体が含まれていると判断する（ステップＳ１０８）。比がＴＨ２以下ならば、図１０の（ｃ）のケースに相当するので、斜め方向に移動する物体が含まれていると判断する（ステップＳ１１０）。以上の判断の結果はタイル対応で移動物体検出部１１０に一時的に保存される。
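The decision logic of steps S104 to S110 can be written as a small function. This is a sketch under stated assumptions: the name `classify_region` and the string labels are hypothetical, and the handling of Min(Yh, Yv) = 0 (where the ratio is undefined) is an assumption that the text does not address.

```python
def classify_region(yh, yv, th1, th2):
    """Classify one tile or region from its Yh and Yv values.

    Returns "none", "horizontal", "vertical" or "diagonal".
    """
    lo, hi = min(yh, yv), max(yh, yv)
    # Step S104: both directions keep enough detail -> no moving object.
    if lo > th1:
        return "none"
    # Step S106: ratio of the larger to the smaller amount.
    # Assumption: a zero minimum counts as an infinite ratio unless both are zero.
    if lo == 0:
        ratio = float("inf") if hi > 0 else 1.0
    else:
        ratio = hi / lo
    if ratio > th2:
        # Step S108: the direction with the reduced amount is the motion direction.
        return "horizontal" if yh <= yv else "vertical"
    # Step S110: both amounts reduced by a similar degree.
    return "diagonal"
```

Suitable values for the thresholds TH1 and TH2 are not given in the text and would have to be chosen for the imaging conditions at hand.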
【００６０】
ステップＳ１００からステップＳ１１０の処理が各タイルについて繰り返される。最後のタイルまで処理が終了すると（ステップＳ１１２，Ｙｅｓ）、移動物体検出部１１０は、同じ移動方向の移動物体が含まれていると判断された、接続したタイル群を１つの移動物体の領域として検出する（ステップＳ１１４）。例えば、図１３の（ａ）に示すように右方向に移動する自動車の像が含まれる静止画像が、図示のようにタイル分割されて処理される場合には、図１３の（ｂ）に示すように、左右矢印が記されたタイルは水平方向に移動する移動物体が含まれていると判断され、それらの接続したタイル群（網掛けされた領域）が１つの移動物体の領域として検出される。
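Step S114 groups connected tiles that share the same motion direction into one moving-object region. One way to sketch this is a breadth-first flood fill over the tile grid. The use of 4-connectivity, the function name `group_tiles` and the label strings are assumptions, because the text does not define what counts as "connected".

```python
from collections import deque


def group_tiles(labels):
    """Group adjacent tiles with the same direction label.

    labels: 2-D list with one of "none", "horizontal", "vertical",
    "diagonal" per tile. Returns a list of (direction, [(row, col), ...]).
    """
    rows, cols = len(labels), len(labels[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or labels[r][c] == "none":
                continue
            direction = labels[r][c]
            queue = deque([(r, c)])
            seen[r][c] = True
            tiles = []
            while queue:
                y, x = queue.popleft()
                tiles.append((y, x))
                # Visit the four edge-adjacent neighbours.
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and labels[ny][nx] == direction):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            regions.append((direction, sorted(tiles)))
    return regions
```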
【００６１】
移動物体検出の単位領域としてタイルより小さな領域が選ばれた場合の動作について、図１４に示すフローチャートを参照し説明する。なお、図３乃至図６を参照した説明から明らかなように、ウェーブレット変換の各サブバンドの各係数は、原画像の特定の画素（群）と対応関係がある。したがって、デインターリーブされた各サブバンドを矩形領域に分割した場合、その各領域は原画像上の特定の矩形領域と対応している。
【００６２】
まず、１つのタイルが選ばれ（ステップＳ２００）、そのタイルに対しウェーブレット変換処理部１０２で２次元ウェーブレット変換が実行される（ステップＳ２０１）。
【００６３】
Ｙｈ，Ｙｖ算出部１０５において、注目しているタイルを再分割した１つの領域を選択し（ステップＳ２０２）、その領域のＨＬ，ＬＨサブバンド係数を用いて、水平方向の高周波成分量Ｙｈと垂直方向の高周波成分量Ｙｖをそれぞれ計算する（ステップＳ２０４）。この計算式としては前記の（３）式，（４）式などを用いることができる。
【００６４】
移動物体検出部１１０において、Ｍｉｎ（Ｙｈ，Ｙｖ）と閾値ＴＨ１の比較判定を行う（ステップＳ２０６）。
【００６５】
Ｍｉｎ（Ｙｈ，Ｙｖ）が閾値ＴＨ１より大きければ（ステップＳ２０６，Ｙｅｓ）、注目している領域に移動物体が含まれていないと判断し、処理はステップＳ２１２に進む。
【００６６】
移動物体検出部１１０は、Ｍｉｎ（Ｙｈ，Ｙｖ）が閾値ＴＨ１以下ならば（ステップＳ２０６，Ｎｏ）、注目している領域に移動物体が含まれていると判断する。そして、Ｍａｘ（Ｙｈ，Ｙｖ）／Ｍｉｎ（Ｙｈ，Ｙｖ）を計算し、この比と閾値ＴＨ２との比較判定を行う（ステップＳ２０８）。比がＴＨ２より大きい場合、Ｍｉｎ（Ｙｈ，Ｙｖ）がＹｈならば水平方向に移動する物体が含まれていると判断し、Ｍｉｎ（Ｙｈ，Ｙｖ）がＹｖならば垂直方向に移動する物体が含まれていると判断する（ステップＳ２０９）。比がＴＨ２以下ならば、斜め方向に移動する物体が含まれていると判断する（ステップＳ２１０）。以上の判断の結果はタイルの領域対応に移動物体検出部１１０に一時的に保存される。
【００６７】
ステップＳ２０２からステップＳ２１０の処理がタイル内の各領域について繰り返される。選択されているタイルの最後の領域まで処理が終了すると（ステップＳ２１２，Ｙｅｓ）、ステップＳ２００に戻り、次のタイルについて同様の処理が繰り返される。
【００６８】
最後のタイルの処理が終了すると（ステップＳ２１４，Ｙｅｓ）、移動物体検出部１１０は、同じ移動方向の移動物体が含まれていると判断した領域の接続した領域群を１つの移動物体の領域として検出する（ステップＳ２１６）。
【００６９】
このようなタイルより小さな領域を単位とした移動物体検出によれば、タイル単位の場合よりも移動物体の領域をより精密に検出することができる。また、静止画像を１タイルとしてウェーブレット変換を施す場合にも移動物体検出が可能である。
【００７０】
この実施の形態の画像処理装置の各手段の機能及び装置内の処理は、ハードウェア又はファームウェアにより実現することも、パソコンなどの汎用コンピュータを利用しソフトウェアにより実現することも可能であり、そのいずれの態様も本発明に包含される。また、そのようなプログラムが記録された記録（記憶）媒体も本発明に包含される。
【００７１】
以下に説明する実施の形態から理解されるように、本発明は、圧縮処理に２次元ウェーブレット変換が含まれるＪＰＥＧ２０００のような圧縮方式が利用される画像処理方法及び装置に好適に適用できる。
【００７２】
《実施の形態２》図１５は、本発明の別の実施の形態を説明するためのブロック図である。この実施の形態は、ＪＰＥＧ２０００のアルゴリズムにより静止画像の圧縮処理を実行する過程で、２次元ウェーブレット変換により生成されるＨＬ，ＬＨサブバンド係数を利用して移動物体の検出を行うとともに、検出された移動物体の領域の画質を他の領域より向上させた符号化データを生成する。
【００７３】
図１５において、圧縮処理部１００は、図示しない画像ソースより、デジタルカメラで撮影されたような静止画像の画像データを取り込み、ＪＰＥＧ２０００のアルゴリズムにより圧縮処理を行う手段である。この圧縮処理部１００の内部構成は、図１に示したものと同様であるので図示しない。
【００７４】
Ｙｈ，Ｙｖ算出部１０５と移動物体検出部１１０は、前記実施の形態１のものと同様である。ただし、Ｙｈ，Ｙｖ算出部１０５は水平方向、垂直方向の高周波成分量Ｙｈ，Ｙｖの計算に、圧縮処理部１００における圧縮処理過程で生成される量子化前のＨＬ，ＬＨサブバンドの係数を用いる。なお、圧縮対象の静止画像がカラー画像の場合には、Ｙ（輝度）コンポーネント（ＲＧＢコンポーネントの場合はＧコンポーネント）のＨＬ，ＬＨサブバンド係数のみが高周波成分量Ｙｈ，Ｙｖの計算に用いられる。
【００７５】
移動物体検出部１１０は、その水平方向、垂直方向の高周波成分量Ｙｈ，Ｙｖに基づいて静止画像中の移動物体を検出するが、その検出結果は圧縮処理部１００にも与えられ、圧縮処理部１００において移動物体の領域の画質を他の領域より向上させるように圧縮条件が制御される。
【００７６】
移動物体検出の単位領域として、タイルと、それより小さい領域（例えば、図９に関連して説明したコードブロックや複数コードブロックに対応した領域、あるいはプリシンクトに対応した領域など）を選ぶことができる。単位領域としてタイルが選ばれた場合の移動物体検出動作は図１２に示した通りである。ただし、ステップＳ１０１は圧縮処理部１００内のウェーブレット変換処理手段で実行される。タイルより小さい領域が単位領域として選ばれた場合の動作は図１４に示した通りであるが、ステップＳ２０１は圧縮処理部１００内のウェーブレット変換処理手段で実行される。
【００７７】
圧縮処理部１００には、ステップＳ１０４（図１２）又はステップＳ２０６（図１４）の判断結果が移動物体検出部１１０より送られる。したがって、圧縮処理の制御のみを目的とする場合には、ステップＳ１０６〜Ｓ１１０，Ｓ１１４（図１２）又はステップＳ２０８〜Ｓ２１０，Ｓ２１６（図１４）の処理を省くことができる。
【００７８】
圧縮処理部１００においては、移動物体が含まれていると判断されたタイル又は領域を、移動物体が含まれていないタイル又は領域より高い画質となるように圧縮処理を実行する。ＪＰＥＧ２０００では、注目した領域（ＲＯＩ；Region of Interest）の画質を他の領域より向上させる選択的領域画質向上機能がある。ＪＰＥＧ２０００の基本方式（JPEG2000 Part１）では、ウェーブレット係数の符号化前に、注目した領域（ＲＯＩ領域）のウェーブレット係数値を上位ビット側へシフトし、その領域外のウェーブレット係数値を下位ビット側へシフトするMax Shift方式が採用されている。ＪＰＥＧ２０００では、ウェーブレット係数の量子化段階で、ＲＯＩ領域のウェーブレット係数値を他の領域よりも細かい量子化ステップで量子化することによっても、ＲＯＩ領域の画質を向上させることができる。圧縮処理部１００は、上記いずれかの方法により、移動物体が含まれていると判断されたタイル又は領域の画質を向上させるように、量子化段階以降の処理を実行する。かくして、移動物体の領域が他の領域より高画質の符号化データが圧縮処理部１００より出力される。すなわち、この実施の形態によれば、移動物体の領域の画質を落とすことなく、符号化データの符号量を削減することができる。
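The idea behind the Max Shift method can be illustrated with integer coefficients. In this sketch the ROI coefficients are scaled up by a power of two that exceeds every background magnitude, so they occupy higher bit planes and are coded first. The function names and the flat list representation are assumptions, and the sketch ignores how the shift value is signalled in a real code stream.

```python
def maxshift_encode(coeffs, roi_mask):
    """Scale ROI coefficients above all background coefficients.

    coeffs: list of integer wavelet coefficients.
    roi_mask: list of booleans, True where the coefficient belongs to the ROI.
    Returns (shifted coefficients, shift amount s).
    """
    bg_max = max((abs(c) for c, m in zip(coeffs, roi_mask) if not m), default=0)
    s = bg_max.bit_length()  # guarantees 2**s > every background magnitude
    shifted = [c * (1 << s) if m else c for c, m in zip(coeffs, roi_mask)]
    return shifted, s


def maxshift_decode(shifted, s):
    """Undo the scaling: any magnitude >= 2**s must belong to the ROI."""
    out = []
    for c in shifted:
        if abs(c) >= (1 << s):
            mag = abs(c) >> s
            out.append(mag if c > 0 else -mag)
        else:
            out.append(c)
    return out
```

Because the decoder can tell ROI coefficients apart by magnitude alone, no explicit mask needs to be transmitted in this scheme.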
【００７９】
この実施の形態の画像処理装置の各手段の機能又は装置内の処理は、ハードウェア又はファームウェアにより実現することも、パソコンなどの汎用コンピュータを利用しソフトウェアにより実現することも可能であり、そのいずれの態様も本発明に包含される。また、そのようなプログラムが記録された記録（記憶）媒体も本発明に包含される。
【００８０】
《実施の形態３》図１６は、本発明の別の実施の形態を説明するためのブロック図である。この実施の形態は、前記実施の形態２と同様、ＪＰＥＧ２０００のアルゴリズムにより静止画像の圧縮処理を実行する過程で、２次元ウェーブレット変換により得られるＨＬ，ＬＨサブバンド係数を利用して移動物体の検出を行うとともに、圧縮処理により生成された符号化データを、検出された移動物体の領域の画質を他の領域より向上させた符号化データに変換する。
【００８１】
この実施の形態では、符号加工部１１５が追加され、移動物体検出部１１０の検出結果が符号加工部１１５に与えられる。静止画像の画像データは、圧縮処理部１００によって、ロスレス圧縮されるか、ロスレスに近い低圧縮率でロシィ圧縮され、得られた符号化データは符号加工部１１５に入力される。
【００８２】
ＪＰＥＧ２０００の符号化データは、符号状態のままで符号の廃棄などの加工が可能である。符号加工部１１５は、圧縮処理部１００により生成された符号化データを移動物体検出結果に従って加工することにより、移動物体が含まれるタイル又は領域の画質を低下させることなく全体の符号量を減少させた符号化データに変換する。これ以外は、前記実施の形態２と同様であるので、その説明は省略する。
【００８３】
なお、圧縮処理部１００と符号加工部１１５の間に記憶手段を介在させてもよい。また、符号加工部１１５を他の部分とネットワークなどで接続する態様も可能である。このような態様も本発明に包含される。
【００８４】
この実施の形態の画像処理装置の各手段の機能又は装置内の処理は、ハードウェア又はファームウェアにより実現することも、パソコンなどの汎用コンピュータを利用しソフトウェアにより実現することも可能であり、そのいずれの態様も本発明に包含される。また、そのようなプログラムが記録された記録（記憶）媒体も本発明に包含される。
【００８５】
《実施の形態４》図１７は、本発明の別の実施の形態を説明するためのブロック図である。この実施の形態は、ＪＰＥＧ２０００の符号化データの伸長処理により得られるＨＬ，ＬＨサブバンド係数を利用して移動物体の検出を行う。
【００８６】
図１７において、伸長処理部２００は、図示しない画像ソースよりＪＰＥＧ２０００のアルゴリズムにより圧縮された静止画像の符号化データを取り込み、伸長処理を行う手段である。この伸長処理部２００の内部構成は、図１に示したものと同様であるので図示しない。
【００８７】
Ｙｈ，Ｙｖ算出部１０５と移動物体検出部１１０は、前記実施の形態１のものと同様である。ただし、Ｙｈ，Ｙｖ算出部１０５は水平方向、垂直方向の高周波成分量Ｙｈ，Ｙｖの計算に、伸長処理部２００による伸長処理によって生成される逆量子化後のＨＬ，ＬＨサブバンド係数を用いる。なお、静止画像がカラー画像の場合には、Ｙコンポーネント（ＲＧＢコンポーネントの場合はＧコンポーネント）のＨＬ，ＬＨサブバンド係数のみが高周波成分量Ｙｈ，Ｙｖの計算に用いられる。
【００８８】
移動物体検出部１１０は、その水平方向、垂直方向の高周波成分量Ｙｈ，Ｙｖに基づいて静止画像中の移動物体を検出する。前記実施の形態２，３と同様に、移動物体検出の単位領域として、タイルと、それより小さい領域（例えば図９に関連して説明したコードブロックや複数コードブロックに対応する領域、あるいはプリシンクトに対応した領域など）を選ぶことができる。単位領域としてタイルが選ばれた場合又はタイルより小さな領域が選ばれた場合の移動物体検出動作はそれぞれ図１２又は図１４に示した通りであるが、ステップＳ１０１又はステップＳ２０１は、伸長処理部２００内のエントロピー復号化、逆量子化によるサブバンド係数生成の処理に置き換わる。
【００８９】
この実施の形態の画像処理装置の各手段の機能及び装置内の処理は、ハードウェア又はファームウェアにより実現することも、パソコンなどの汎用コンピュータを利用しソフトウェアにより実現することも可能であり、そのいずれの態様も本発明に包含される。また、そのようなプログラムが記録された記録（記憶）媒体も本発明に包含される。
【００９０】
《実施の形態５》図１８は、本発明の別の実施の形態を説明するためのブロック図である。この実施の形態は、ＪＰＥＧ２０００の符号化データの伸長処理によって生成されるＨＬ，ＬＨサブバンド係数を利用して移動物体の検出を行うとともに、符号化データを、検出された移動物体の領域を高画質、それ以外の領域を低画質にした符号化データに変換する。
【００９１】
この実施の形態においては、前記実施の形態３と同様の符号加工部１１５が追加され、これ以外は前記実施の形態４と同様である。
【００９２】
伸長処理部２００は、図示しない画像ソースから、ＪＰＥＧ２０００のアルゴリズムによりロスレス圧縮された、又はロスレスに近いロシィ圧縮された静止画像の符号化データを取り込み、伸長処理を行う。この伸長処理で生成されたＨＬ，ＬＨサブバンド係数を用いて、Ｙｈ，Ｙｖ算出部１０５で水平方向、垂直方向の高周波成分量Ｙｈ，Ｙｖが計算され、それを用いて移動物体検出部１１０で移動物体の領域が検出される。移動物体検出処理動作は前記実施の形態４と同様である。各タイル又はそれより小さな各領域に対する移動物体検出結果（図１２のステップＳ１０４又は図１４のステップＳ２０６の結果）は符号加工部１１５に与えられ、符号加工部１１５において、前記実施の形態３と同様に、移動物体が含まれると判断されたタイル又は領域が高画質、それ以外の領域が低画質となるように符号化データの加工が行われる。
【００９３】
なお、符号化データの加工のみを目的とする場合には、図１２のステップＳ１０６〜Ｓ１１０，Ｓ１１４の処理、又は図１４のステップＳ２０８〜Ｓ２１０，Ｓ２１６の処理を省くことができる。
【００９４】
この実施の形態の画像処理装置の各手段の機能及び装置内の処理は、ハードウェア又はファームウェアにより実現することも、パソコンなどの汎用コンピュータを利用しソフトウェアにより実現することも可能であり、そのいずれの態様も本発明に包含される。また、そのようなプログラムが記録された記録（記憶）媒体も本発明に包含される。
【００９５】
《実施の形態６》図１９は、本発明の別の実施の形態を説明するためのブロック図である。この実施の形態に係る画像処理装置は撮像装置、より具体的にはデジタルカメラなどの電子カメラ装置である。
【００９６】
図１９において、３００は光学レンズ、絞り機構、シャッター機構などから構成される一般的な撮像光学系である。３０１はＣＣＤ型又はＭＯＳ型のイメージャであり、撮像光学系３００により結像される光学像を色分解してから光量に応じた電気信号に変換する。３０２はイメージャ３０１の出力信号をサンプリングしてデジタル信号に変換するＣＤＳ・Ａ／Ｄ変換部であり、相関二重サンプリング（ＣＤＳ）回路とＡ／Ｄ変換回路からなる。
【００９７】
３０３は画像プロセッサであり、例えばプログラム（マイクロコード）で制御される高速のデジタル信号プロセッサからなる。この画像プロセッサ３０３は、ＣＤＳ・Ａ／Ｄ変換部３０２より入力する画像データに対するガンマ補正処理、ホワイトバランス調整処理、エッジ強調などのためのエンハンス処理のような信号処理のほか、イメージャ３０１、ＣＤＳ・Ａ／Ｄ変換部３０２、表示部３０４を制御し、また、オートフォーカス制御、自動露出制御、ホワイトバランス調整などのための情報の検出などを行う。表示部３０４は例えば液晶表示装置であり、モニタリング画像（スルー画像）や撮影画像などの画像の表示、その他の情報の表示などに利用される。
【００９８】
以上に説明した撮像光学系３００、イメージャ３０１、ＣＤＳ・Ａ／Ｄ変換部３０２及び画像プロセッサ３０３は、静止画像を撮影する撮像手段を構成している。
【００９９】
３０８はＪＰＥＧ２０００のアルゴリズムによる画像データの圧縮処理と符号化データの伸長処理を行う圧縮／伸長処理部である。３１２は記録（記憶）媒体３１３に対する情報の書き込み／読み出しを行う媒体記録部である。記録媒体３１３は例えば各種メモリカードである。３１４はインターフェース部である。この画像処理装置は、インターフェース部３１４を介し、有線又は無線の伝送路あるいはネットワークを通じ、外部のパソコンなどと情報の交換を行うことができる。
【０１００】
３０６はシステムコントローラであり、マイクロコンピュータからなる。このシステムコントローラ３０６は、操作部３０７から入力されるユーザの操作情報や画像プロセッサ３０３から与えられる情報などに応答して、撮像光学系３００のシャッター機構、絞り機構、ズーミング機構、画像プロセッサ３０３、圧縮／伸長処理部３０８、媒体記録部３１２などの制御を行う。３０５はメモリであり、画像データやその符号化データなどの一時記憶域、画像プロセッサ３０３やシステムコントローラ３０６、圧縮／伸長処理部３０８、媒体記録部３１２などの作業記憶域として利用される。
【０１０１】
操作部３０７は、電子カメラ装置の操作のための一般的な操作ボタン（スイッチ）のほかに、移動物体検出に関連した指示を入力するための操作ボタンも備える。
【０１０２】
通常の撮影動作は次の通りである。操作部３０７のレリーズボタンが押下されると、システムコントローラ３０６より撮影指示が画像プロセッサ３０３に与えられ、画像プロセッサ３０３は静止画像撮影の条件でイメージャ３０１を駆動する。撮影された静止画像のデータは画像プロセッサ３０３を経由してメモリ３０５に記憶される。この画像データは、システムコントローラ３０６の制御により、予め指定された、又はデフォルトの圧縮率で、圧縮／伸長処理部３０８により圧縮され、その符号化データは媒体記録部３１２により記録媒体３１３に記録される。
【０１０３】
前記実施の形態１（図１１）に対応した動作モードを指定することもできる。この動作モードにおいては、撮影された画像データは圧縮／伸長処理部３０８で圧縮されるが、圧縮過程で生成されるＨＬ，ＬＨサブバンド係数を利用した移動物体検出処理がシステムコントローラ３０６で実行される。すなわち、圧縮／伸長処理部３０８の２次元ウェーブレット変換機能が図１１中のウェーブレット変換処理部１０２として利用され、図１１中のＹｈ，Ｙｖ算出部１０５及び移動物体検出部１１０の機能はシステムコントローラ３０６上でプログラムにより実現される。移動物体の検出結果は、符号化データに付加されて（例えばメインヘッダ又はタイルヘッダにコメントとして記述される）、符号化データとともに記録媒体３１３に記録される。
【０１０４】
前記実施の形態２（図１５）に対応した動作モードを指定することも可能である。この動作モードにおいては、撮影された画像データは圧縮／伸長処理部３０８で圧縮されるが、この圧縮処理の過程で生成されるＨＬ，ＬＨサブバンド係数を利用した移動物体検出処理がシステムコントローラ３０６で実行される。すなわち、図１５中のＹｈ，Ｙｖ算出部１０５と移動物体検出部１１０の機能は、システムコントローラ３０６上でプログラムにより実現される。そして、移動物体検出結果に従って、圧縮／伸長処理部３０８では、移動物体が含まれると判断されたタイル又はより小さな領域の画質を向上させるような圧縮処理が行われる。生成された符号化データは記録媒体３１３に記録される。
【０１０５】
前記実施の形態３（図１６）に対応した動作モードも指定できる。この動作モードにおいては、撮影された画像データは圧縮／伸長処理部３０８で圧縮されメモリ３０５又は記録媒体３１３に記録される。この圧縮処理の過程で生成されるＨＬ，ＬＨサブバンド係数を利用した移動物体検出処理がシステムコントローラ３０６で実行され、その検出結果もメモリ３０５に記録される。その後、メモリ３０５又は記録媒体３１３に記録されている符号化データに対し、移動物体検出結果に従った符号加工処理が実行され、移動物体の領域の画質を落とすことなく符号量を削減した符号化データが生成され、この符号化データが記録媒体３１３に記録される。すなわち、図１６中の符号加工部１１５の機能はシステムコントローラ３０６上でプログラムにより実現される。
【０１０６】
前記実施の形態４（図１７）に対応した動作モードを指定することもできる。この動作モードにおいては、指定された画像の符号化データが記録媒体３１３より読み出され、それが圧縮／伸長処理部３０８により伸長処理される。この伸長処理の過程で生成されるＨＬ，ＬＨサブバンド係数を利用した移動物体検出処理がシステムコントローラ３０６で実行される。移動物体検出結果は、例えば伸長された画像データとともに表示部３０４に表示され、あるいは、元の符号化データに付加されて（例えばメインヘッダ又はタイルヘッダにコメントとして記述される）、符号化データとともに記録媒体に記録される。
【０１０７】
前記実施の形態５（図１８）に対応した動作モードを指定することも可能である。この動作モードが指定された場合には、指定された画像の符号化データが媒体記録部３１２により記録媒体３１３からメモリ３０５に読み出される。この符号化データは圧縮／伸長処理部３０８により伸長されるが、その過程で生成されるＨＬ，ＬＨサブバンド係数を利用した移動物体検出処理と、その結果に従った符号化データの加工がシステムコントローラ３０６で実行される。すなわち、図１８中のＹｈ，Ｙｖ算出部１０５、移動物体検出部１１０及び符号加工部１１５の機能がシステムコントローラ３０６上でプログラムにより実現される。メモリ３０５に符号加工後の符号化データが生成され、これは媒体記録部３１２により記録媒体３１３に記録される。
【０１０８】
なお、圧縮／伸長処理部３０８を例えば画像プロセッサ３０３上でプログラムにより実現してもよい。また、Ｙｈ，Ｙｖ算出部１０５、移動物体検出部１１０、符号加工部１１５に対応するハードウェア又はファームウェアを別に設けることもできる。
【０１０９】
前記実施の形態２乃至６における圧縮方式は必ずしもＪＰＥＧ２０００のみに限定されるわけではなく、圧縮過程に２次元ウェーブレット変換が含まれる他の圧縮方式も利用し得る。
【０１１０】
【発明の効果】
以上の説明から明らかなように、本発明によれば、デジタルカメラなどによって撮影された静止画像中の移動物体の領域及び移動方向を、時間的に前後した画像を参照することなく、検出することができる。圧縮された静止画像中の移動物体とその移動方向を検出することができる。静止画像中の移動物体の画質を落とすことなく、静止画像の符号化データの符号量を削減することができる。圧縮過程に２次元ウェーブレット変換を含むＪＰＥＧ２０００のような圧縮方式による圧縮処理や、その符号化データの伸長処理を伴う場合には、圧縮処理又は伸長処理で生成されるＨＬ，ＬＨサブバンド係数を移動物体検出に利用することにより、装置構成を単純化することができる、等々の効果を得られる。
【図面の簡単な説明】
【図１】ＪＰＥＧ２０００のアルゴリズムを説明するための簡略化したブロック図である。
【図２】ＪＰＥＧ２０００の符号化データのフォーマットを示す図である。
【図３】タイル画像の一例を示す図である。
【図４】図３のタイル画像に垂直方向のウェーブレット変換を行った結果を示す図である。
【図５】図４の係数列に水平方向のウェーブレット変換を行った結果を示す図である。
【図６】図５の係数列をデインターリーブした図である。
【図７】図６のＬＬサブバンドに２次元のウェーブレット変換を行った結果を示す図である。
【図８】３レベルの２次元ウェーブレット変換を行った場合の各レベルのサブバンド係数を示す図である。
【図９】プリシンクトとコードブロックを説明するための図である。
【図１０】水平方向及び垂直方向の高周波成分量と移動方向との関連を示す図である。
【図１１】本発明の実施の形態１を説明するためのブロック図である。
【図１２】タイル単位の移動物体検出処理を説明するためのフローチャートである。
【図１３】タイル単位の移動物体検出の例を示す図である。
【図１４】タイルより小さな領域単位の移動物体検出処理を説明するためのフローチャートである。
【図１５】本発明の実施の形態２を説明するためのブロック図である。
【図１６】本発明の実施の形態３を説明するためのブロック図である。
【図１７】本発明の実施の形態４を説明するためのブロック図である。
【図１８】本発明の実施の形態５を説明するためのブロック図である。
【図１９】本発明の実施の形態６を説明するためのブロック図である。
【符号の説明】
１００圧縮処理部
１０２ウェーブレット変換処理部
１０５Ｙｈ，Ｙｖ算出部
１１０移動物体検出部
１１５符号加工部
２００伸長処理部
３００撮像光学系
３０１イメージャ
３０２ＣＤＳ・Ａ／Ｄ変換部
３０３画像プロセッサ
３０６システムコントローラ
３０８圧縮／伸長処理部

[0001]
[Technical Field of the Invention]
The present invention relates to image processing, and more particularly to detection of a moving object in a still image captured by a digital camera or the like and image processing related thereto.
[0002]
[Prior art]
In various fields of image processing, it is sometimes necessary to detect a moving object in an image. In a moving image, a moving object can be detected relatively easily by comparing temporally consecutive frames (see, for example, Patent Document 1). For a still image, however, this approach cannot be applied when no image captured close in time is available.
[0003]
Images are often compressed prior to recording or transmission. JPEG is widely used for compressing still images, but JPEG2000 (ISO / IEC FCD 15444-1) has attracted attention as an alternative compression method (for example, see Non-Patent Document 1).
[0004]
[Patent Document 1]
JP-A-10-145776
[Non-patent document 1]
Yasuyuki Nomizu, "Next Generation Image Coding Method JPEG2000",
Trikeps, Inc., February 13, 2001
[0005]
[Problems to be solved by the invention]
Accordingly, it is an object of the present invention to provide a novel method and apparatus for detecting a moving object in a still image captured by a digital camera or the like. It is another object of the present invention to provide a novel method and apparatus for detecting a moving object in a still image and compressing the still image according to the detection result. Another object of the present invention is to provide a novel method and apparatus for detecting a moving object in a compressed still image. Another object of the present invention is to provide a novel method and apparatus for converting encoded data obtained by compressing a still image in accordance with the result of detection of a moving object.
[0006]
[Means for Solving the Problems]
As described in claim 1, the image processing method of the present invention is characterized by including a process of calculating, from the HL subband coefficients and LH subband coefficients obtained by applying a two-dimensional wavelet transform to a still image, the horizontal and vertical high-frequency component amounts for each region of the still image, and a process of detecting a moving object in the still image based on those high-frequency component amounts.
[0007]
Another feature of the image processing method of the present invention is that, as described in claim 2, in addition to the configuration of claim 1, the method further includes a compression process for the still image that includes a two-dimensional wavelet transform, and the HL subband coefficients and LH subband coefficients generated by the two-dimensional wavelet transform in the compression process are used for calculating the high-frequency component amounts.
[0008]
Another feature of the image processing method of the present invention is that, as described in claim 3, in the configuration of claim 2, the compression process generates, in accordance with the moving object detection result, encoded data in which the region of the moving object has higher image quality than other regions.
[0009]
Another feature of the image processing method of the present invention is that, as described in claim 4, in addition to the configuration of claim 2, the method further includes a process of processing the encoded data generated by the compression process in accordance with the moving object detection result, thereby converting it into code data in which the region of the moving object has higher image quality than other regions.
[0010]
Another feature of the image processing method of the present invention is that, as described in claim 5, in addition to the configuration of claim 1, the method further includes a decompression process for encoded data of a still image compressed by a compression process that includes a two-dimensional wavelet transform, and the HL subband coefficients and LH subband coefficients generated by the decompression process are used for calculating the high-frequency component amounts.
[0011]
Another feature of the image processing method of the present invention is that, as described in claim 6, in addition to the configuration of claim 5, the method further includes a process of processing the encoded data in accordance with the moving object detection result, thereby converting it into code data in which the region of the moving object has higher image quality than other regions.
[0012]
Another feature of the image processing method of the present invention is that, as described in claim 7, in the configuration of any one of claims 1 to 6, the high-frequency component amounts are calculated for each region identical to the region to which the two-dimensional wavelet transform is applied.
[0013]
Another feature of the image processing method of the present invention is that, as described in claim 8, in the configuration of any one of claims 1 to 6, the high-frequency component amounts are calculated for each region smaller than the region to which the two-dimensional wavelet transform is applied.
[0014]
Another feature of the image processing method of the present invention is that, as described in claim 9, in the configuration of claim 7 or 8, the moving object detection process judges a region in which the smaller of the horizontal and vertical high-frequency component amounts is at or below a predetermined threshold to be a region containing a moving object.
[0015]
Another feature of the image processing method of the present invention is that, as described in claim 10, in the configuration of claim 9, the moving object detection process determines the moving direction of the moving object based on the relative magnitudes of the horizontal and vertical high-frequency component amounts. The image processing apparatus of the present invention, as described in claim 11, includes wavelet transform processing means that performs a two-dimensional wavelet transform on a still image; high-frequency component amount calculating means that calculates, from the HL subband coefficients and LH subband coefficients generated by the two-dimensional wavelet transform, the horizontal and vertical high-frequency component amounts for each region of the still image; and moving object detecting means that detects a moving object in the still image based on the high-frequency component amounts.
[0016]
Another feature of the image processing apparatus of the present invention is that, as described in claim 12, in addition to the configuration of claim 11, the apparatus includes compression processing means that performs compression processing of a still image, and the wavelet transform processing means is included in the compression processing means.
[0017]
Another feature of the image processing apparatus of the present invention is that, as described in claim 13, in the configuration of claim 12, the compression processing means generates, in accordance with the result of moving object detection by the moving object detecting means, encoded data in which the region of the moving object has higher image quality than other regions.
[0018]
Another feature of the image processing apparatus of the present invention is that, as described in claim 14, in addition to the configuration of claim 12, the apparatus includes code processing means that processes the encoded data generated by the compression processing in accordance with the result of moving object detection by the moving object detecting means, thereby converting it into code data in which the region of the moving object has higher image quality than other regions.
[0019]
The image processing apparatus of the present invention, as described in claim 15, includes decompression processing means that performs decompression processing on encoded data of a still image compressed by compression processing that includes a two-dimensional wavelet transform; high-frequency component amount calculating means that calculates, from the HL subband coefficients and LH subband coefficients generated by the decompression processing, the horizontal and vertical high-frequency component amounts for each region of the still image; and moving object detecting means that detects a moving object in the still image based on the high-frequency component amounts.
[0020]
Another feature of the image processing apparatus of the present invention is that, as described in claim 16, in addition to the configuration of claim 15, the apparatus includes code processing means that processes the encoded data in accordance with the result of moving object detection by the moving object detecting means, thereby converting it into code data in which the region of the moving object has higher image quality than other regions.
[0021]
Another feature of the image processing apparatus of the present invention is that, as described in claim 17, in the configuration of any one of claims 11 to 16, the high-frequency component amount calculating means calculates the high-frequency component amounts for each region identical to the region to which the two-dimensional wavelet transform is applied.
[0022]
Another feature of the image processing apparatus of the present invention is that, as described in claim 18, in the configuration of any one of claims 11 to 16, the high-frequency component amount detecting means calculates the high-frequency component amounts for each region smaller than the region to which the two-dimensional wavelet transform is applied.
[0023]
According to a nineteenth aspect of the present invention, there is provided an imaging apparatus including the respective means according to any one of the eleventh to sixteenth aspects and imaging means for capturing a still image.
[0024]
BEST MODE FOR CARRYING OUT THE INVENTION
Prior to the description of the embodiments of the present invention, JPEG2000 will be outlined to the extent necessary for its understanding.
[0025]
FIG. 1 is a block diagram showing the flow of the basic compression / expansion processing of JPEG2000. Image data to be subjected to compression processing is divided into non-overlapping rectangular areas (tiles) for each component, and is processed in tile units for each component. However, it is also possible to process the entire image as one tile.
[0026]
Each tile image of each component is subjected to color space conversion from RGB data or CMY data to YCrCb data by a color space conversion / inverse conversion unit 1 for the purpose of improving a compression ratio. In some cases, this color space conversion is omitted.
[0027]
The tile image after the color space conversion is subjected to a two-dimensional wavelet transform (discrete wavelet transform) by the wavelet transform / inverse transform unit 2 and is decomposed into a plurality of subbands. The wavelet coefficients are quantized by the quantization / dequantization unit 3 for each subband. JPEG2000 is capable of both reversible (lossless) compression and irreversible (lossy) compression. In the case of lossless compression, the quantization step width is always 1, so substantially no quantization is performed at this stage.
[0028]
Each of the quantized subband coefficients is entropy-encoded by an entropy encoding / decoding unit 4. For this entropy coding, a block-based bit plane coding method called EBCOT (Embedded Block Coding with Optimized Truncation) including block division, coefficient modeling, and binary arithmetic coding is used. The bit plane of each subband coefficient after quantization is encoded for each block called a code block from the upper bits to the lower bits.
[0029]
In the tag processing unit 5, the codes of the code blocks generated in the entropy encoding / decoding unit 4 are put together to create a packet, and then the packets are arranged according to the progression order and necessary tag information is added. As a result, encoded data in a predetermined format is created. JPEG2000 defines five types of progression order based on a combination of a resolution level, a position (precinct), a layer, and a component (color component) for code order control.
[0030]
FIG. 2 shows the format of the JPEG2000 encoded data generated in this manner. As shown in FIG. 2, the encoded data starts with a tag called an SOC marker indicating the start of the encoded data, followed by tag information called a main header that describes encoding parameters, quantization parameters, and the like. After that, the code data for each tile follows. The code data for each tile consists of a tag called an SOT marker, tag information called a tile header (Tile Header), a tag called an SOD marker, and tile data (Tile Data) containing the code string of the tile. After the last tile data, a tag called an EOC marker indicating the end is placed.
[0031]
Decompression processing is the reverse of compression processing. The encoded data is decomposed by the tag processing unit 5 into the code string of each tile of each component. This code string is entropy-decoded by the entropy encoding / decoding unit 4. The decoded wavelet coefficients are inversely quantized by the quantization / dequantization unit 3 and then subjected to a two-dimensional inverse wavelet transform by the wavelet transform / inverse transform unit 2, whereby the image of each tile of each component is reconstructed. Each tile image of each component is subjected to inverse color conversion processing by the color space conversion / inverse conversion unit 1 and returned to a tile image composed of components such as RGB.
[0032]
Here, the two-dimensional wavelet transform directly related to the present invention will be further described. FIGS. 3 to 6 are diagrams for explaining the process of applying the wavelet transform called the 5 × 3 transform, which is adopted in JPEG2000, in the vertical and horizontal directions to a 16 × 16 pixel tile image of a monochrome image (or one component of a color image).
[0033]
FIG. 3 shows a tile image before conversion. As shown in the drawing, XY coordinates are taken, and for a certain x, the pixel value of a pixel whose Y coordinate is y is represented as P (y) (0 ≦ y ≦ 15). In JPEG2000, a coefficient C (2i + 1) is obtained by applying a high-pass filter in the vertical direction (Y-coordinate direction) centering on a pixel whose Y coordinate is an odd number (y = 2i + 1). Next, a coefficient C (2i) is obtained by performing a low-pass filter centering on the pixel whose Y coordinate is an even number (y = 2i) (this is performed for all x). Here, the high-pass filter and the low-pass filter are expressed by Equations (1) and (2), respectively. Floor (x) in the expression is a floor function of x (a function that replaces a real number x with an integer that does not exceed x and is closest to x).
[0034]
C (2i + 1) = P (2i + 1) −floor ((P (2i) + P (2i + 2)) / 2) Equation (1)
C (2i) = P (2i) + floor ((C (2i-1) + C (2i + 1) +2) / 4) Equation (2)
[0035]
Note that, at the edges of the image, the pixels adjacent to the center pixel may not exist. In this case, the missing pixel values are supplemented by a method called “mirroring”. Mirroring literally folds the pixel values back line-symmetrically about the boundary and regards the folded-back values as the values of the missing adjacent pixels.
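The vertical filtering of Equations (1) and (2), together with mirroring at the edges, can be illustrated for a single column of pixels. The following is only a minimal sketch: the function names `mirror` and `lift_53` are hypothetical, and it assumes integer pixel values and a sequence of at least two samples.

```python
def mirror(index, length):
    """Reflect an out-of-range index back into [0, length) about the edge sample."""
    if index < 0:
        return -index
    if index >= length:
        return 2 * (length - 1) - index
    return index


def lift_53(samples):
    """One level of the 5x3 transform on a 1-D integer sequence.

    Returns the interleaved coefficients C: odd positions hold high-pass (H)
    coefficients, even positions hold low-pass (L) coefficients.
    """
    n = len(samples)
    coeffs = list(samples)
    # Equation (1): high-pass step centered on odd positions.
    for k in range(1, n, 2):
        left = samples[mirror(k - 1, n)]
        right = samples[mirror(k + 1, n)]
        coeffs[k] = samples[k] - (left + right) // 2   # // is floor division
    # Equation (2): low-pass step centered on even positions, using the
    # high-pass coefficients computed above.
    for k in range(0, n, 2):
        left = coeffs[mirror(k - 1, n)]
        right = coeffs[mirror(k + 1, n)]
        coeffs[k] = samples[k] + (left + right + 2) // 4
    return coeffs


print(lift_53([10] * 8))   # a flat signal yields zero high-pass coefficients
```

Because Python's `//` rounds toward negative infinity, it matches the floor function used in the equations even for negative intermediate values.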
[0036]
If the coefficients obtained by the high-pass filter are denoted by H and the coefficients obtained by the low-pass filter are denoted by L, the image of FIG. 3 is converted into an array of L coefficients and H coefficients as shown in FIG. 4.
[0037]
Subsequently, a high-pass filter is applied to the coefficient array of FIG. 4 in the horizontal direction, centering on each coefficient whose X coordinate is odd (x = 2i + 1), and then a low-pass filter is applied centering on each coefficient whose X coordinate is even (x = 2i). This is performed for all y. In this case, P (2i) and the like in the above equations are read as representing coefficient values.
[0038]
If the coefficients obtained by applying the low-pass filter centered on the L coefficients are denoted by LL, the coefficients obtained by applying the high-pass filter centered on the L coefficients by HL, the coefficients obtained by applying the low-pass filter centered on the H coefficients by LH, and the coefficients obtained by applying the high-pass filter centered on the H coefficients by HH, the coefficient array of FIG. 4 is converted into the coefficient array shown in FIG. 5. Here, a group of coefficients given the same symbol is called a subband, and FIG. 5 is composed of four subbands.
[0039]
With the above processing, one wavelet transform (one level of decomposition) is completed. FIG. 6 shows the wavelet coefficients collected for each subband. Arranging the coefficients in this manner is called deinterleaving, and arranging them in the state shown in FIG. 5 is called interleaving.
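The vertical pass, the horizontal pass, and the deinterleaving into four subbands can be put together as follows. This is a sketch under assumptions rather than the apparatus itself: `lift_53` and `decompose_2d` are hypothetical names, the tile is a list of rows of integers, and both tile dimensions are at least 2.

```python
def lift_53(seq):
    """Interleaved one-level 5x3 lifting of a 1-D integer list (mirrored ends)."""
    n = len(seq)

    def at(values, k):
        # Mirror an out-of-range index about the edge sample.
        return values[-k if k < 0 else (2 * (n - 1) - k if k >= n else k)]

    out = list(seq)
    for k in range(1, n, 2):                       # high-pass, Equation (1)
        out[k] = seq[k] - (at(seq, k - 1) + at(seq, k + 1)) // 2
    for k in range(0, n, 2):                       # low-pass, Equation (2)
        out[k] = seq[k] + (at(out, k - 1) + at(out, k + 1) + 2) // 4
    return out


def decompose_2d(tile):
    """One decomposition level; returns deinterleaved LL, HL, LH, HH subbands."""
    height, width = len(tile), len(tile[0])
    # Vertical pass: filter every column (fixed x, varying y).
    columns = [lift_53([tile[y][x] for y in range(height)]) for x in range(width)]
    vertical = [[columns[x][y] for x in range(width)] for y in range(height)]
    # Horizontal pass: filter every row of the intermediate L/H array.
    interleaved = [lift_53(row) for row in vertical]

    def pick(row_start, col_start):
        # Even index = low-pass result, odd index = high-pass result.
        return [row[col_start::2] for row in interleaved[row_start::2]]

    return {
        "LL": pick(0, 0),   # low-pass around L coefficients
        "HL": pick(0, 1),   # high-pass around L coefficients
        "LH": pick(1, 0),   # low-pass around H coefficients
        "HH": pick(1, 1),   # high-pass around H coefficients
    }


# Vertical stripes (strong vertical edges) put all detail energy into HL.
stripes = [[100 if x % 2 else 0 for x in range(8)] for _ in range(8)]
bands = decompose_2d(stripes)
print(bands["HL"][0], bands["LH"][0])
```

A second decomposition level would be obtained by calling `decompose_2d` again on the returned `LL` subband.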
[0040]
The second wavelet transform is performed by a similar process, regarding the LL subband as an original image. When the processing result is deinterleaved, coefficients of subbands as shown in FIG. 7 are obtained. Note that the prefixes 1 and 2 of the coefficients in FIGS. 6 and 7 indicate the number of wavelet transforms (decomposition level) until the coefficients are obtained.
[0041]
FIG. 8 shows the state of subband decomposition when the number of decomposition levels = 3. Note that the number enclosed in parentheses in each subband shown in FIG. 8D indicates the resolution level.
[0042]
Here, precincts, code blocks, packets, and layers in JPEG2000 will be described. Their sizes satisfy the relationship image ≧ tile ≧ subband ≧ precinct ≧ code block.
[0043]
A precinct is a rectangular area of a subband, and a set of three areas at the same spatial position in the HL, LH, and HH subbands of the same decomposition level is treated as one precinct. However, in the LL subband, one region is treated as one precinct. The size of the precinct can be the same as the size of the subband. A rectangular area obtained by dividing the precinct is a code block. FIG. 9 illustrates one precinct and a code block at the decomposition level 1. A set of three areas at the same spatial position, which is described as a precinct in the drawing, is treated as one precinct.
[0044]
A packet is obtained by extracting and collecting a part of the codes of all the code blocks included in the precinct (for example, the codes of three bit planes from the highest bit to the third bit). Packets with an empty code are also allowed. Packets are generated by grouping the codes of the code blocks, and the encoded data is formed by arranging the packets in a desired progression order. The portion below the SOD for each tile in FIG. 2 is a set of packets.
[0045]
When the packets of all precincts (that is, all code blocks and all subbands) are collected, a part of the code of the entire image is obtained (for example, the codes of the bit planes from the most significant bit plane to the third bit plane of the wavelet coefficients of the entire image); this is a layer. Therefore, as the number of layers decoded at the time of decompression increases, the image quality of the reproduced image improves. That is, a layer can be said to be a unit of image quality. When all the layers are collected, the codes of all the bit planes of the entire image are obtained.
[0046]
An imager used in a digital camera or the like captures an image by converting a light amount within a certain exposure time into an electric signal. Therefore, in the case of a still image including an object that moves at a speed such that the amount of movement within the exposure time cannot be ignored, the high-frequency component in the moving direction decreases in the area of the moving object.
[0047]
As understood from the description of the two-dimensional wavelet transform, the HL subband coefficients represent the vertical edge components of the image, that is, the high-frequency components in the horizontal direction, and the LH subband coefficients represent the horizontal edge components of the image, that is, the high-frequency components in the vertical direction. Therefore, the horizontal high-frequency component amount (measure) Yh can be calculated from the HL subband coefficients, and the vertical high-frequency component amount (measure) Yv can be calculated from the LH subband coefficients. When a still image captured by a digital camera or the like is divided into tiles and subjected to two-dimensional wavelet transform tile by tile, the difference between Yh and Yv is generally small for tiles that do not include a moving object, as shown in FIG. 10D. On the other hand, in a tile including an object that moves in the horizontal direction, Yh is considerably smaller than Yv, as shown in FIG. 10A. In a tile including an object that moves in the vertical direction, Yv is considerably smaller than Yh, as shown in FIG. 10B. In a tile including an object that moves in an oblique direction, both Yh and Yv decrease, as shown in FIG. 10C, but the difference between them is small. The present invention detects a moving object in a still image and its moving direction by utilizing this relationship between the moving direction of the object and the change in the horizontal and vertical high-frequency component amounts.
[0048]
Hereinafter, some embodiments of the present invention will be described with reference to the drawings. Note that the same reference numerals are used for the same or corresponding parts in a plurality of drawings to reduce duplication of description.
[0049]
<< Embodiment 1 >> FIG. 11 is a block diagram for explaining one embodiment of the present invention.
[0050]
In FIG. 11, an image source 50 is a device such as a storage device or a personal computer that stores data of a still image to be processed. The image source 50 may be inside or outside the image processing apparatus and connected to the image processing apparatus via a transmission path such as a network.
[0051]
The wavelet transform processing unit 102 is means for taking in still image data from the image source 50 and executing a two-dimensional wavelet transform (for example, the above-described 5 × 3 transform) on the still image for each non-overlapping rectangular area (hereinafter, tile).
[0052]
The Yh, Yv calculation unit 105 is means for taking in the coefficients of the HL and LH subbands output from the wavelet transform processing unit 102 and using them to calculate the horizontal and vertical high-frequency component amounts Yh and Yv.
[0053]
The moving object detection unit 110 is means for executing processing for detecting a moving object in a still image based on the horizontal and vertical high-frequency component amounts Yh and Yv. Either a tile or an area smaller than a tile can be selected as the unit area for moving object detection. First, the operation when a tile is selected as the unit area will be described with reference to the flowchart shown in FIG. 12.
[0054]
One tile is selected (step S100), and two-dimensional wavelet transform of the tile is performed by the wavelet transform processing unit 102 (step S101).
[0055]
Using the HL and LH subband coefficients of the tile, the Yh, Yv calculation unit 105 calculates the horizontal high-frequency component amount Yh by, for example, Equation (3), and the vertical high-frequency component amount Yv by, for example, Equation (4) (step S102).
Yh = ah · Σ|1HL| + bh · Σ|2HL| + ch · Σ|3HL|    Equation (3)
Yv = av · Σ|1LH| + bv · Σ|2LH| + cv · Σ|3LH|    Equation (4)
Here, ah, bh, ch, av, bv, and cv are constants of 0 or more.
[0056]
In this example, the HL and LH subband coefficients of decomposition levels 1, 2, and 3 are used, but Yh and Yv may instead be calculated using the HL and LH subband coefficients of fewer levels (e.g., only the 1HL and 1LH subband coefficients) or of more levels.
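Equations (3) and (4) can be evaluated as sketched below. The sketch reads Σ|nHL| as the sum of the absolute values of all coefficients of the nHL subband, which is an interpretation of the notation; the function names and the default weights of 1.0 are assumptions chosen for illustration, since the text only requires the constants to be 0 or more.

```python
def abs_sum(subband):
    """Sum of absolute coefficient values of one subband (a list of rows)."""
    return sum(abs(value) for row in subband for value in row)


def high_frequency_amounts(hl_bands, lh_bands,
                           h_weights=(1.0, 1.0, 1.0),
                           v_weights=(1.0, 1.0, 1.0)):
    """Evaluate Equations (3) and (4).

    hl_bands: [1HL, 2HL, 3HL]; lh_bands: [1LH, 2LH, 3LH].
    h_weights: (ah, bh, ch); v_weights: (av, bv, cv).
    Passing fewer bands (e.g. only level 1) works as well, because zip()
    stops at the shorter sequence.
    """
    yh = sum(w * abs_sum(band) for w, band in zip(h_weights, hl_bands))
    yv = sum(w * abs_sum(band) for w, band in zip(v_weights, lh_bands))
    return yh, yv


hl = [[[1, -2], [3, -4]], [[5]], [[-6]]]
lh = [[[0, 0], [0, 0]], [[1]], [[2]]]
print(high_frequency_amounts(hl, lh))
```

Setting a weight to 0 simply removes that decomposition level from the measure.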
[0057]
The moving object detection unit 110 compares the smaller value of Yh and Yv, ie, Min (Yh, Yv), with the threshold value TH1 and makes a determination (step S104).
[0058]
If Min (Yh, Yv) is larger than the threshold value TH1 (step S104, Yes), this corresponds to the case of FIG. 10D; it is determined that the tile of interest does not include a moving object, and the processing proceeds to step S112.
[0059]
If Min (Yh, Yv) is equal to or smaller than the threshold value TH1 (step S104, No), this corresponds to the case of FIG. 10A, FIG. 10B, or FIG. 10C, and the moving object detection unit 110 determines that the tile of interest includes a moving object. Then, in order to determine the moving direction of the moving object, the ratio Max (Yh, Yv) / Min (Yh, Yv) between the larger and the smaller of Yh and Yv is calculated, and this ratio is compared with the threshold value TH2 (step S106). If the ratio is greater than TH2, this corresponds to the case of FIG. 10A or FIG. 10B: if Min (Yh, Yv) is Yh, it is determined that an object moving in the horizontal direction is included, and if Min (Yh, Yv) is Yv, it is determined that an object moving in the vertical direction is included (step S108). If the ratio is equal to or less than TH2, this corresponds to the case of FIG. 10C, and it is determined that an object moving in an oblique direction is included (step S110). The result of the above determination is temporarily stored in the moving object detection unit 110 for each tile.
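The decision sequence of steps S104 to S110 can be written as one small function. This is a hedged sketch: the name `classify_region` and the string labels are hypothetical, TH2 is assumed to be greater than 1, and the handling of a zero minimum is an assumption because the text does not address it.

```python
def classify_region(yh, yv, th1, th2):
    """Return 'none', 'horizontal', 'vertical' or 'oblique' for one unit area."""
    smaller, larger = min(yh, yv), max(yh, yv)
    if smaller > th1:
        return "none"                      # step S104, Yes: no moving object
    # Assumption (not in the text): avoid dividing by zero. A zero minimum
    # with a non-zero maximum is treated as an unbounded ratio, and a
    # completely flat area (both zero) as a ratio of 1.
    if smaller > 0:
        ratio = larger / smaller
    else:
        ratio = float("inf") if larger > 0 else 1.0
    if ratio > th2:                        # step S106
        # Step S108: the direction with the smaller amount is the blurred one.
        return "horizontal" if yh < yv else "vertical"
    return "oblique"                       # step S110


print(classify_region(20, 400, th1=100, th2=3))
```

The thresholds TH1 and TH2 are not given numerically in the text, so the values used here are placeholders only.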
[0060]
The processing from step S100 to step S110 is repeated for each tile. When the processing has been completed up to the last tile (step S112, Yes), the moving object detection unit 110 detects each group of connected tiles determined to include a moving object with the same moving direction as the area of one moving object (step S114). For example, when a still image including an automobile moving rightward as shown in FIG. 13A is divided into tiles and processed, the tiles marked with left-right arrows in FIG. 13 are determined to include a moving object that moves in the horizontal direction, and the group of connected tiles (shaded area) is detected as one moving object area.
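Step S114, which merges connected tiles with the same moving direction into one moving object area, amounts to connected-component labeling. The sketch below assumes 4-connectivity (tiles sharing an edge), which the text does not specify, and uses the hypothetical function name `group_moving_tiles` and the label strings 'none', 'horizontal', 'vertical', and 'oblique'.

```python
from collections import deque


def group_moving_tiles(labels):
    """Group 4-connected tiles that share the same moving-direction label.

    labels: 2-D list of per-tile results.
    Returns a list of (direction, [(row, col), ...]) regions.
    """
    rows, cols = len(labels), len(labels[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or labels[r][c] == "none":
                continue
            direction, members, queue = labels[r][c], [], deque([(r, c)])
            seen[r][c] = True
            while queue:                          # breadth-first flood fill
                cr, cc = queue.popleft()
                members.append((cr, cc))
                for nr, nc in ((cr - 1, cc), (cr + 1, cc), (cr, cc - 1), (cr, cc + 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and not seen[nr][nc] and labels[nr][nc] == direction):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            regions.append((direction, sorted(members)))
    return regions


grid = [
    ["none", "horizontal", "horizontal"],
    ["none", "horizontal", "none"],
    ["vertical", "none", "none"],
]
print(group_moving_tiles(grid))
```

Using 8-connectivity instead would only require adding the four diagonal neighbors to the inner loop.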
[0061]
The operation when a region smaller than a tile is selected as the unit region for moving object detection will be described with reference to the flowchart shown in FIG. 14. As is clear from the description with reference to FIGS. 2 to 6, each coefficient of each subband of the wavelet transform corresponds to a specific pixel (or group of pixels) of the original image. Therefore, when each deinterleaved subband is divided into rectangular regions, each region corresponds to a specific rectangular region on the original image.
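The correspondence between a rectangle of pixels and a rectangle of subband coefficients can be approximated by dividing the pixel coordinates by 2 raised to the decomposition level. The sketch below makes that assumption explicit and ignores the overlap of the filter support across region borders; `subband_region` and `crop` are hypothetical helper names.

```python
def subband_region(pixel_rect, level):
    """Map a pixel rectangle (x0, y0, x1, y1), end-exclusive and relative to
    the tile origin, to the matching rectangle in a deinterleaved subband of
    the given decomposition level (1, 2, 3, ...)."""
    scale = 1 << level                    # each level halves the resolution

    def ceil_div(value):
        return -(-value // scale)

    x0, y0, x1, y1 = pixel_rect
    return (x0 // scale, y0 // scale, ceil_div(x1), ceil_div(y1))


def crop(subband, rect):
    """Extract the coefficients of a rectangle from a subband (list of rows)."""
    x0, y0, x1, y1 = rect
    return [row[x0:x1] for row in subband[y0:y1]]


# The upper-right 8x8 quadrant of a 16x16 tile, seen at levels 1 and 3.
print(subband_region((8, 0, 16, 8), level=1))
print(subband_region((8, 0, 16, 8), level=3))
```

The cropped HL and LH coefficients of each region can then be fed into the same Yh, Yv calculation that is used for whole tiles.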
[0062]
First, one tile is selected (step S200), and two-dimensional wavelet transform is performed on the tile by the wavelet transform processing unit 102 (step S201).
[0063]
The Yh, Yv calculation unit 105 selects one of the regions obtained by subdividing the tile of interest (step S202), and uses the HL and LH subband coefficients of that region to calculate the horizontal high-frequency component amount Yh and the vertical high-frequency component amount Yv (step S204). Equations (3) and (4) described above can be used for this calculation.
[0064]
The moving object detector 110 compares Min (Yh, Yv) with the threshold value TH1 (step S206).
[0065]
If Min (Yh, Yv) is greater than the threshold value TH1 (step S206, Yes), it is determined that the region of interest does not include a moving object, and the process proceeds to step S212.
[0066]
If Min (Yh, Yv) is equal to or smaller than the threshold value TH1 (step S206, No), the moving object detection unit 110 determines that the region of interest includes a moving object. Then, Max (Yh, Yv) / Min (Yh, Yv) is calculated, and this ratio is compared with the threshold value TH2 (step S208). If the ratio is greater than TH2, it is determined that an object moving in the horizontal direction is included if Min (Yh, Yv) is Yh, and that an object moving in the vertical direction is included if Min (Yh, Yv) is Yv (step S209). If the ratio is equal to or less than TH2, it is determined that an object moving in an oblique direction is included (step S210). The result of the above determination is temporarily stored in the moving object detection unit 110 in association with the region.
[0067]
The processing from step S202 to step S210 is repeated for each area in the tile. When the processing is completed up to the last area of the selected tile (Step S212, Yes), the process returns to Step S200, and the same processing is repeated for the next tile.
[0068]
When the processing of the last tile is completed (step S214, Yes), the moving object detection unit 110 detects each group of connected regions determined to include a moving object with the same moving direction as the area of one moving object (step S216).
[0069]
According to such moving object detection in units of regions smaller than tiles, it is possible to detect the region of moving objects more precisely than in the case of tile units. Moving object detection is also possible when performing a wavelet transform using a still image as one tile.
[0070]
The functions of the units of the image processing apparatus of this embodiment and the processing in the apparatus can be realized by hardware or firmware, or by software using a general-purpose computer such as a personal computer; a program for that purpose is also included in the present invention. Further, a recording (storage) medium on which such a program is recorded is also included in the present invention.
[0071]
As will be understood from the embodiments described below, the present invention can be suitably applied to an image processing method and apparatus using a compression method such as JPEG2000 in which compression processing includes two-dimensional wavelet transform.
[0072]
<< Embodiment 2 >> FIG. 15 is a block diagram for explaining another embodiment of the present invention. In this embodiment, a moving object is detected by using the HL and LH subband coefficients generated by the two-dimensional wavelet transform in the course of compressing a still image with the JPEG2000 algorithm, and encoded data is generated in which the image quality of the area of the detected moving object is higher than that of other areas.
[0073]
In FIG. 15, a compression processing unit 100 is means for taking in image data of a still image, such as an image photographed by a digital camera, from an image source (not shown) and performing compression processing using the JPEG2000 algorithm. The internal configuration of the compression processing unit 100 is not shown because it is the same as that shown in FIG. 1.
[0074]
The Yh, Yv calculation unit 105 and the moving object detection unit 110 are the same as those in the first embodiment. However, the Yh, Yv calculation unit 105 uses the coefficients of the HL and LH subbands before quantization, generated in the course of the compression processing in the compression processing unit 100, to calculate the horizontal and vertical high-frequency component amounts Yh and Yv. When the still image to be compressed is a color image, only the HL and LH subband coefficients of the Y (luminance) component (or the G component in the case of RGB components) are used for calculating the high-frequency component amounts Yh and Yv.
[0075]
The moving object detection unit 110 detects a moving object in a still image based on the horizontal and vertical high-frequency component amounts Yh and Yv. The detection result is also given to the compression processing unit 100, where the compression conditions are controlled so that the image quality of the area of the moving object is higher than that of the other areas.
[0076]
As the unit area for detecting a moving object, either a tile or a smaller area (for example, an area corresponding to a code block or a plurality of code blocks described with reference to FIG. 10, or an area corresponding to a precinct) can be selected. The moving object detection operation when a tile is selected as the unit area is as shown in FIG. 12, except that step S101 is executed by the wavelet transform processing means in the compression processing unit 100. The operation when an area smaller than a tile is selected as the unit area is as shown in FIG. 14, except that step S201 is executed by the wavelet transform processing means in the compression processing unit 100.
[0077]
The determination result of step S104 (FIG. 12) or step S206 (FIG. 14) is sent from the moving object detection unit 110 to the compression processing unit 100. Therefore, when the purpose is only to control the compression processing, the processing in steps S106 to S110 and S114 (FIG. 12) or steps S208 to S210 and S216 (FIG. 14) can be omitted.
[0078]
The compression processing unit 100 compresses a tile or an area determined to include a moving object so that it has higher image quality than a tile or an area not including a moving object. JPEG2000 has a selective region image quality improvement function for making the image quality of a region of interest (ROI; Region of Interest) higher than that of other regions. The basic method of JPEG2000 (JPEG2000 Part 1) adopts the Max Shift method, in which, before the wavelet coefficients are encoded, the wavelet coefficient values of the region of interest (ROI region) are shifted to the upper bits relative to the wavelet coefficient values outside the region, which occupy the lower bits. In JPEG2000, the image quality of the ROI region can also be improved by quantizing the wavelet coefficient values of the ROI region with a finer quantization step than that of the other regions at the quantization stage. The compression processing unit 100 executes the processing from the quantization stage onward by either of the above methods so as to improve the image quality of the tiles or areas determined to include a moving object. In this way, encoded data having higher image quality in the area of the moving object than in the other areas is output from the compression processing unit 100. That is, according to this embodiment, the code amount of the encoded data can be reduced without lowering the image quality of the area of the moving object.
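The idea behind the Max Shift method can be illustrated on a flat list of integer coefficients. This is a simplified sketch and not the JPEG2000 Part 1 procedure itself: the function names are hypothetical, the ROI is given as a boolean mask per coefficient, and the shift amount s is chosen here as the smallest value for which 2^s exceeds every background magnitude.

```python
def max_shift_encode(coeffs, roi_mask):
    """Shift ROI coefficients above every background coefficient.

    coeffs: flat list of integer wavelet coefficients.
    roi_mask: list of booleans, True where the coefficient belongs to the ROI.
    Returns (shifted coefficients, shift amount s).
    """
    background_peak = max(
        (abs(c) for c, roi in zip(coeffs, roi_mask) if not roi), default=0)
    shift = background_peak.bit_length()      # smallest s with 2**s > peak
    shifted = [(c << shift if roi else c) for c, roi in zip(coeffs, roi_mask)]
    return shifted, shift


def max_shift_decode(shifted, shift):
    """Undo the shift: a magnitude of at least 2**s can only be an ROI value."""
    threshold = 1 << shift
    restored = []
    for c in shifted:
        if abs(c) >= threshold:
            magnitude = abs(c) >> shift
            restored.append(magnitude if c > 0 else -magnitude)
        else:
            restored.append(c)
    return restored


coeffs = [5, -3, 12, -7, 0, 2]
mask = [True, False, False, True, True, False]
shifted, s = max_shift_encode(coeffs, mask)
print(shifted, s, max_shift_decode(shifted, s))
```

Because every non-zero ROI coefficient ends up in higher bit planes than any background coefficient, bit-plane coding from the upper bits downward reaches the ROI first, and the decoder needs only the shift amount, not the mask.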
[0079]
The functions of the units of the image processing apparatus of the present embodiment and the processing in the apparatus can be realized by hardware or firmware, or by software using a general-purpose computer such as a personal computer; a program for that purpose is also included in the present invention. Further, a recording (storage) medium on which such a program is recorded is also included in the present invention.
[0080]
<< Embodiment 3 >> FIG. 16 is a block diagram for explaining another embodiment of the present invention. In this embodiment, as in the second embodiment, a moving object is detected by using the HL and LH subband coefficients obtained by the two-dimensional wavelet transform in the course of compressing a still image with the JPEG2000 algorithm, and the encoded data generated by the compression processing is converted into encoded data in which the image quality of the detected moving object area is higher than that of other areas.
[0081]
In this embodiment, a code processing unit 115 is added, and the detection result of the moving object detection unit 110 is provided to the code processing unit 115. The image data of the still image is subjected to lossless compression or lossy compression at a low compression ratio close to lossless by the compression processing unit 100, and the obtained encoded data is input to the code processing unit 115.
[0082]
JPEG2000 encoded data can be processed, for example by discarding codes, while it remains in the encoded state. The code processing unit 115 processes the encoded data generated by the compression processing unit 100 according to the moving object detection result, thereby converting it into encoded data whose overall code amount is reduced without deteriorating the image quality of the tiles or areas that include a moving object. Except for this point, this embodiment is the same as the second embodiment, and further description is omitted.
[0083]
Note that a storage unit may be interposed between the compression processing unit 100 and the code processing unit 115. In addition, a mode in which the code processing unit 115 is connected to other parts via a network or the like is also possible. Such an embodiment is also included in the present invention.
[0084]
The functions of the units of the image processing apparatus of the present embodiment and the processing in the apparatus can be realized by hardware or firmware, or by software using a general-purpose computer such as a personal computer; a program for that purpose is also included in the present invention. Further, a recording (storage) medium on which such a program is recorded is also included in the present invention.
[0085]
<< Embodiment 4 >> FIG. 17 is a block diagram for explaining another embodiment of the present invention. In this embodiment, a moving object is detected by using HL and LH subband coefficients obtained by a process of expanding JPEG2000 encoded data.
[0086]
In FIG. 17, a decompression processing unit 200 is means for taking in encoded data of a still image compressed by the JPEG2000 algorithm from an image source (not shown) and performing decompression processing. The internal configuration of the decompression processing unit 200 is not shown because it is the same as that shown in FIG. 1.
[0087]
The Yh and Yv calculation unit 105 and the moving object detection unit 110 are the same as those in the first embodiment. However, the Yh, Yv calculation unit 105 uses the dequantized HL, LH subband coefficients generated by the decompression processing by the decompression processing unit 200 to calculate the horizontal and vertical high frequency component amounts Yh, Yv. When the still image is a color image, only the HL and LH subband coefficients of the Y component (or the G component in the case of the RGB component) are used for calculating the high frequency component amounts Yh and Yv.
[0088]
The moving object detection unit 110 detects a moving object in a still image based on the horizontal and vertical high-frequency component amounts Yh and Yv. As in the second and third embodiments, either a tile or a smaller area (for example, an area corresponding to a code block or a plurality of code blocks described with reference to FIG. 10, or an area corresponding to a precinct) can be selected as the unit area for moving object detection. The moving object detection operation when a tile is selected as the unit area, or when an area smaller than a tile is selected, is as shown in FIG. 12 or FIG. 14, respectively, except that step S101 or step S201 is replaced with the process of generating subband coefficients by entropy decoding and inverse quantization.
[0089]
The functions of the units of the image processing apparatus according to the present embodiment and the processing in the apparatus can be realized by hardware or firmware, or by software using a general-purpose computer such as a personal computer; a program for that purpose is also included in the present invention. Further, a recording (storage) medium on which such a program is recorded is also included in the present invention.
[0090]
<< Embodiment 5 >> FIG. 18 is a block diagram for explaining another embodiment of the present invention. In this embodiment, a moving object is detected using the HL and LH subband coefficients generated in the course of decompressing JPEG2000 encoded data, and the encoded data is converted into encoded data in which the area of the detected moving object has high image quality and the other areas have reduced image quality.
[0091]
In this embodiment, a code processing unit 115 similar to that of the third embodiment is added, and the other parts are the same as those of the fourth embodiment.
[0092]
The decompression processing unit 200 takes in, from an image source (not shown), encoded data of a still image that has been compressed losslessly or nearly losslessly by the JPEG2000 algorithm, and performs decompression processing. Using the HL and LH subband coefficients generated in the course of the decompression processing, the Yh, Yv calculation unit 105 calculates the horizontal and vertical high-frequency component amounts Yh and Yv, and the moving object detection unit 110 uses them to detect the area of a moving object. The moving object detection operation is the same as in the fourth embodiment. The moving object detection result for each tile or each smaller area (the result of step S104 in FIG. 12 or step S206 in FIG. 14) is given to the code processing unit 115, which, as in the third embodiment, processes the encoded data so that the tiles or areas determined to include a moving object have high image quality and the other areas have low image quality.
[0093]
When only the processing of the encoded data is intended, the processing of steps S106 to S110 and S114 in FIG. 12 or the processing of steps S208 to S210 and S216 in FIG. 14 can be omitted.
[0094]
The functions of each unit of the image processing apparatus of this embodiment and the processing in the apparatus can be realized by hardware or firmware, or by software running on a general-purpose computer such as a personal computer. A program for that purpose is also included in the present invention, as is a recording (storage) medium on which such a program is recorded.
[0095]
<< Embodiment 6 >> FIG. 19 is a block diagram for explaining another embodiment of the present invention. The image processing device according to the present embodiment is an imaging device, more specifically, an electronic camera device such as a digital camera.
[0096]
In FIG. 19, reference numeral 300 denotes a general imaging optical system including an optical lens, an aperture mechanism, a shutter mechanism, and the like. Reference numeral 301 denotes a CCD type or MOS type imager, which separates an optical image formed by the imaging optical system 300 into colors and converts each color into an electric signal corresponding to the amount of light. Reference numeral 302 denotes a CDS / A / D conversion unit that samples the output signal of the imager 301 and converts it into a digital signal; it includes a correlated double sampling (CDS) circuit and an A / D conversion circuit.
[0097]
An image processor 303 is, for example, a high-speed digital signal processor controlled by a program (microcode). The image processor 303 performs signal processing such as gamma correction, white balance adjustment, and edge enhancement on image data input from the CDS / A / D conversion unit 302. It also controls the imager 301, the CDS / A / D conversion unit 302, and the display unit 304, and detects information for autofocus control, automatic exposure control, white balance adjustment, and the like. The display unit 304 is, for example, a liquid crystal display device, and is used for displaying images such as monitoring images (through images) and captured images, and for displaying other information.
[0098]
The above-described imaging optical system 300, imager 301, CDS / A / D conversion unit 302, and image processor 303 constitute an imaging unit that captures a still image.
[0099]
Reference numeral 308 denotes a compression / decompression processing unit that performs a compression process on image data and a decompression process on encoded data according to the JPEG2000 algorithm. Reference numeral 312 denotes a medium recording unit that writes information to and reads information from the recording (storage) medium 313. The recording medium 313 is, for example, any of various memory cards. Reference numeral 314 denotes an interface unit. The image processing apparatus can exchange information with an external personal computer or the like through a wired or wireless transmission path or a network via the interface unit 314.
[0100]
Reference numeral 306 denotes a system controller composed of a microcomputer. In response to user operation information input from the operation unit 307 or information given from the image processor 303, the system controller 306 controls the shutter mechanism, aperture mechanism, and zooming mechanism of the imaging optical system 300, as well as the image processor 303, the compression / decompression processing unit 308, and the medium recording unit 312. Reference numeral 305 denotes a memory, which is used as a temporary storage area for image data and its coded data, and as a work storage area for the image processor 303, the system controller 306, the compression / decompression processing unit 308, and the medium recording unit 312.
[0101]
The operation unit 307 includes, in addition to general operation buttons (switches) for operating the electronic camera device, operation buttons for inputting an instruction related to detection of a moving object.
[0102]
The normal shooting operation is as follows. When the release button of the operation unit 307 is pressed, a shooting instruction is given from the system controller 306 to the image processor 303, and the image processor 303 drives the imager 301 under still image shooting conditions. Data of the photographed still image is stored in the memory 305 via the image processor 303. Under the control of the system controller 306, the image data is compressed by the compression / decompression processing unit 308 at a pre-designated or default compression ratio, and the encoded data is recorded on the recording medium 313 by the medium recording unit 312.
[0103]
An operation mode corresponding to the first embodiment (FIG. 11) can also be designated. In this operation mode, the captured image data is compressed by the compression / decompression processing unit 308, and the moving object detection process using the HL and LH subband coefficients generated in the compression process is executed by the system controller 306. That is, the two-dimensional wavelet transform function of the compression / decompression processing unit 308 is used as the wavelet transform processing unit 102 in FIG. 11, and the functions of the Yh, Yv calculation unit 105 and the moving object detection unit 110 in FIG. 11 are realized by a program on the system controller 306. The detection result of the moving object is added to the encoded data (for example, described as a comment in the main header or a tile header) and recorded on the recording medium 313 together with the encoded data.
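Recording the detection result as a header comment can be sketched as follows. This is a hedged illustration, not the format used by the embodiment. It builds a JPEG2000 comment (COM) marker segment, which consists of the marker 0xFF64, a length field Lcom, a registration field Rcom (1 for Latin text), and the comment bytes. The textual payload layout ("MOVOBJ tile:direction,...") and the function names format_detection and build_comment_segment are hypothetical, and inserting the segment into an actual codestream header is not shown.

```python
import struct
from typing import Dict

COM_MARKER = 0xFF64  # JPEG2000 comment marker
RCOM_LATIN = 1       # registration value for Latin text comments


def format_detection(results: Dict[int, str]) -> str:
    """Serialize per-tile moving directions (hypothetical text layout)."""
    items = ",".join(f"{tile}:{direction}"
                     for tile, direction in sorted(results.items()))
    return f"MOVOBJ {items}"


def build_comment_segment(text: str) -> bytes:
    """Build a COM marker segment carrying the given text."""
    payload = text.encode("latin-1")
    # Lcom counts itself (2 bytes), Rcom (2 bytes), and the payload,
    # but not the 2-byte marker.
    length = 2 + 2 + len(payload)
    if length > 0xFFFF:
        raise ValueError("comment too long for one COM segment")
    return struct.pack(">HHH", COM_MARKER, length, RCOM_LATIN) + payload
```

For example, build_comment_segment(format_detection({3: "vertical", 0: "horizontal"})) yields a segment that a writer could place in the main header or a tile header.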
[0104]
An operation mode corresponding to the second embodiment (FIG. 15) can also be designated. In this operation mode, the captured image data is compressed by the compression / decompression processing unit 308, and the moving object detection process using the HL and LH subband coefficients generated in the course of the compression process is executed by the system controller 306. That is, the functions of the Yh, Yv calculation unit 105 and the moving object detection unit 110 in FIG. 15 are realized by a program on the system controller 306. Then, in accordance with the moving object detection result, the compression / decompression processing unit 308 performs a compression process that raises the image quality of the tiles or smaller areas determined to include the moving object. The generated encoded data is recorded on the recording medium 313.
[0105]
An operation mode corresponding to the third embodiment (FIG. 16) can also be designated. In this operation mode, the photographed image data is compressed by the compression / decompression processing unit 308 and recorded in the memory 305 or on the recording medium 313. A moving object detection process using the HL and LH subband coefficients generated in the compression process is executed by the system controller 306, and the detection result is also recorded in the memory 305. Thereafter, code processing is performed on the coded data recorded in the memory 305 or on the recording medium 313 in accordance with the moving object detection result, generating encoded data whose code amount is reduced without reducing the image quality of the moving object area, and this encoded data is recorded on the recording medium 313. That is, the function of the code processing unit 115 in FIG. 16 is realized by a program on the system controller 306.
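One way such code processing could reduce the code amount is sketched below. It assumes that each tile's encoded data is organized into quality layers, as JPEG2000 permits, and that discarding the upper layers of tiles without a moving object lowers their quality while leaving moving-object tiles untouched. The text does not state that the code processing unit 115 works this way; the data model (a list of layer byte strings per tile) and the names reduce_code_amount and keep_layers are hypothetical.

```python
from typing import Dict, List, Set


def reduce_code_amount(tile_layers: Dict[int, List[bytes]],
                       moving_tiles: Set[int],
                       keep_layers: int) -> Dict[int, List[bytes]]:
    """Keep all layers for moving-object tiles; truncate the others."""
    if keep_layers < 1:
        raise ValueError("keep_layers must be at least 1")
    processed = {}
    for tile, layers in tile_layers.items():
        if tile in moving_tiles:
            processed[tile] = list(layers)  # full quality preserved
        else:
            processed[tile] = list(layers[:keep_layers])  # upper layers dropped
    return processed


def code_amount(tile_layers: Dict[int, List[bytes]]) -> int:
    """Total number of code bytes over all tiles and layers."""
    return sum(len(layer) for layers in tile_layers.values() for layer in layers)
```

Because only whole layers are discarded, this kind of processing operates on the code itself and would not require decoding and re-encoding the image.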
[0106]
An operation mode corresponding to the fourth embodiment (FIG. 17) can also be designated. In this operation mode, the encoded data of the designated image is read from the recording medium 313 and decompressed by the compression / decompression processing unit 308. The system controller 306 executes a moving object detection process using the HL and LH subband coefficients generated in the course of the decompression process. The moving object detection result is, for example, displayed on the display unit 304 together with the decompressed image data, or added to the original coded data (for example, described as a comment in the main header or a tile header) and recorded on the recording medium 313 together with the coded data.
[0107]
An operation mode corresponding to the fifth embodiment (FIG. 18) can also be designated. When this operation mode is designated, the encoded data of the designated image is read from the recording medium 313 into the memory 305 by the medium recording unit 312 and decompressed by the compression / decompression processing unit 308. The moving object detection process using the HL and LH subband coefficients generated in the decompression process, and the processing of the encoded data according to the detection result, are executed by the system controller 306. That is, the functions of the Yh, Yv calculation unit 105, the moving object detection unit 110, and the code processing unit 115 in FIG. 18 are realized by a program on the system controller 306. The encoded data after code processing is generated in the memory 305 and recorded on the recording medium 313 by the medium recording unit 312.
[0108]
The compression / decompression processing unit 308 may be realized by a program on the image processor 303, for example. Further, hardware or firmware corresponding to the Yh, Yv calculation unit 105, the moving object detection unit 110, and the code processing unit 115 can be separately provided.
[0109]
The compression method in the second to sixth embodiments is not necessarily limited to JPEG2000, and other compression methods including a two-dimensional wavelet transform in the compression process may be used.
[0110]
[Effect of the invention]
As is apparent from the above description, according to the present invention, the region and moving direction of a moving object in a still image captured by a digital camera or the like can be detected without referring to temporally preceding or succeeding images. A moving object and its moving direction can also be detected in a compressed still image. The code amount of the encoded data of a still image can be reduced without lowering the image quality of a moving object in the still image. Furthermore, when the processing involves compression that includes a two-dimensional wavelet transform, such as JPEG2000, or decompression of data so encoded, using the HL and LH subband coefficients generated in the compression or decompression process for moving object detection simplifies the configuration of the apparatus.
[Brief description of the drawings]
FIG. 1 is a simplified block diagram for explaining an algorithm of JPEG2000.
FIG. 2 is a diagram showing a format of JPEG2000 encoded data.
FIG. 3 is a diagram illustrating an example of a tile image.
FIG. 4 is a diagram illustrating a result of performing a wavelet transform in the vertical direction on the tile image of FIG. 3;
FIG. 5 is a diagram illustrating a result of performing a horizontal wavelet transform on the coefficient sequence of FIG. 4;
FIG. 6 is a diagram in which the coefficient sequence of FIG. 5 is deinterleaved.
FIG. 7 is a diagram illustrating a result of performing a two-dimensional wavelet transform on the LL subband in FIG. 6;
FIG. 8 is a diagram illustrating subband coefficients of each level when a three-level two-dimensional wavelet transform is performed.
FIG. 9 is a diagram for explaining precincts and code blocks.
FIG. 10 is a diagram illustrating a relationship between a high-frequency component amount in a horizontal direction and a vertical direction and a movement direction.
FIG. 11 is a block diagram for explaining Embodiment 1 of the present invention.
FIG. 12 is a flowchart illustrating a moving object detection process in tile units.
FIG. 13 is a diagram illustrating an example of moving object detection in tile units.
FIG. 14 is a flowchart for explaining moving object detection processing in units of areas smaller than a tile;
FIG. 15 is a block diagram for explaining Embodiment 2 of the present invention.
FIG. 16 is a block diagram for explaining Embodiment 3 of the present invention.
FIG. 17 is a block diagram for explaining Embodiment 4 of the present invention.
FIG. 18 is a block diagram for explaining Embodiment 5 of the present invention.
FIG. 19 is a block diagram for explaining Embodiment 6 of the present invention.
[Explanation of symbols]
100 Compression unit
102 Wavelet transform processing unit
105 Yh, Yv calculation unit
110 Moving object detection unit
115 Code processing unit
200 Decompression processing unit
300 Imaging optical system
301 Imager
302 CDS / A / D conversion unit
303 Image processor
306 System controller
308 Compression / decompression processing unit

Claims

An image processing method comprising: a process of calculating, for each region of a still image, the horizontal and vertical high-frequency component amounts from the HL subband coefficients and the LH subband coefficients obtained by applying a two-dimensional wavelet transform to the still image; and
a process of detecting a moving object in the still image based on the high-frequency component amounts.

The image processing method according to claim 1, further including a still image compression process that includes a two-dimensional wavelet transform, wherein the HL subband coefficients and the LH subband coefficients generated by the two-dimensional wavelet transform in the compression process are used for calculating the high-frequency component amounts.

3. The image processing method according to claim 2, wherein, in the compression processing, encoded data in which the moving object area has higher image quality than other areas is generated according to the result of detecting the moving object.

The image processing method according to claim 2, further including a process of processing the encoded data generated by the compression process in accordance with the result of detection of the moving object, and converting it into encoded data in which the area of the moving object has higher image quality than other areas.

The image processing method according to claim 1, further including decompression processing of encoded data of a still image compressed by compression processing that includes a two-dimensional wavelet transform, wherein the HL subband coefficients and the LH subband coefficients generated by the decompression processing are used for calculating the high-frequency component amounts.

The image processing method according to claim 5, further including a process of processing the encoded data in accordance with the result of detection of the moving object, and converting it into encoded data in which the area of the moving object has higher image quality than other areas.

The image processing method according to any one of claims 1 to 6, wherein a high-frequency component amount is calculated for each of the same regions as the regions to which the two-dimensional wavelet transform is applied.

7. The image processing method according to claim 1, wherein a high-frequency component amount is calculated for each region smaller than a region to which the two-dimensional wavelet transform is applied.

9. The image processing method according to claim 7, wherein, in the moving object detection processing, a region in which the smaller of the horizontal and vertical high-frequency component amounts is equal to or less than a predetermined threshold is determined to be a region containing a moving object.

10. The image processing method according to claim 9, wherein in the moving object detection processing, the moving direction of the moving object is determined based on a magnitude relationship between the high frequency component amounts in the horizontal direction and the vertical direction.

An image processing apparatus comprising:
wavelet transform processing means for performing a two-dimensional wavelet transform on a still image;
high-frequency component amount calculating means for calculating, for each region of the still image, the horizontal and vertical high-frequency component amounts from the HL subband coefficients and the LH subband coefficients generated by the two-dimensional wavelet transform; and
moving object detection means for detecting a moving object in the still image based on the high-frequency component amounts.

The image processing apparatus according to claim 11, further comprising compression processing means for performing a compression processing of a still image, wherein the wavelet transformation processing means is included in the compression processing means.

13. The image processing apparatus according to claim 12, wherein the compression processing means generates encoded data in which the area of the moving object has higher image quality than other areas, in accordance with the result of detection of the moving object by the moving object detection means.

The image processing apparatus according to claim 12, further comprising code processing means for processing the encoded data generated by the compression processing in accordance with the result of detection of the moving object by the moving object detection means, and converting it into encoded data in which the area of the moving object has higher image quality than other areas.

An image processing apparatus comprising:
decompression processing means for performing decompression processing on encoded data of a still image compressed by compression processing that includes a two-dimensional wavelet transform;
high-frequency component amount calculating means for calculating, for each region of the still image, the horizontal and vertical high-frequency component amounts from the HL subband coefficients and the LH subband coefficients generated by the decompression processing; and
moving object detection means for detecting a moving object in the still image based on the high-frequency component amounts.

The image processing apparatus according to claim 15, further comprising code processing means for processing the encoded data in accordance with the result of detection of the moving object by the moving object detection means, and converting it into encoded data in which the area of the moving object has higher image quality than other areas.

17. The image processing apparatus according to claim 11, wherein the high-frequency component amount calculating means calculates the high-frequency component amounts for each region identical to a region to which the two-dimensional wavelet transform is applied.

17. The image processing apparatus according to claim 11, wherein the high-frequency component amount calculating means calculates the high-frequency component amounts for each region smaller than a region to which the two-dimensional wavelet transform is applied.

An imaging apparatus comprising: the respective units according to claim 11; and an imaging unit configured to capture a still image.

A program for causing a computer to execute each processing according to claim 1.

A computer-readable recording medium on which the program according to claim 20 is recorded.