JP2004064702A

JP2004064702A - Video data compressing apparatus, method and program

Info

Publication number: JP2004064702A
Application number: JP2002224068A
Authority: JP
Inventors: Nagahito Narita; 成田　長人
Original assignee: Nippon Hoso Kyokai NHK; Japan Broadcasting Corp
Current assignee: Japan Broadcasting Corp
Priority date: 2002-07-31
Filing date: 2002-07-31
Publication date: 2004-02-26
Anticipated expiration: 2022-07-31
Also published as: JP3792623B2

Abstract

PROBLEM TO BE SOLVED: To provide a video data compression apparatus, method, and program that, when compressing video data, reduce image-quality degradation in a region of interest showing a notable subject or the like more than in the remaining background region, while improving the compression ratio of the video data.

SOLUTION: A video data compression apparatus 1 includes a region division control means 10 that divides the input video data into regions by judging, for each video frame and for each block of a specific size, whether the block belongs to a moving region of interest or to the remaining background region, and a gradation reduction control means 20 that individually reduces the gradation, which indicates the number of colors each pixel can express, in the region of interest and in the background region.

COPYRIGHT: (C)2004,JPO

Description

【０００１】
【発明の属する技術分野】
本発明は、映像データの圧縮技術に関し、より詳細には、映像データ内の注目領域の画質劣化を軽減して映像データを圧縮する映像データ圧縮装置、その方法及びそのプログラムに関する。
【０００２】
【従来の技術】
現在、携帯端末やＰＤＡ（Ｐｅｒｓｏｎａｌ　Ｄｉｇｉｔａｌ　Ａｓｓｉｓｔａｎｔ）のような情報端末に対して、映像データを配信するサービスが普及し始めている。この場合、高精細な映像データを配信しようとしても、映像データを配信するための伝送路の帯域に制限があるため、映像データは、ＭＰＥＧ−４（Ｍｏｖｉｎｇ　Ｐｉｃｔｕｒｅ　Ｅｘｐｅｒｔｓ　Ｇｒｏｕｐ　４）等により帯域を圧縮して配信されている。
【０００３】
従来、この映像データの圧縮技術に関しては、例えば、ＭＰＥＧ−２（Ｍｏｖｉｎｇ　Ｐｉｃｔｕｒｅ　Ｅｘｐｅｒｔｓ　Ｇｒｏｕｐ　２）にように、動き補償フレーム間予測（ＭＣ：Ｍｏｔｉｏｎ　Ｃｏｍｐｅｎｓａｔｉｏｎ）と、離散コサイン変換（ＤＣＴ：Ｄｉｓｃｒｅｔｅ　Ｃｏｓｉｎｅ　Ｔｒａｎｓｆｏｒｍ）とを組み合わせた技術が一般的に用いられている。
【０００４】
すなわち、ＭＰＥＧ−２等による映像データの圧縮は、まず、動き補償フレーム間予測によって、映像データの映像フレーム間の予測誤差を１６×１６画素（マクロブロック）単位に生成し、その予測誤差を離散コサイン変換することで、周波数成分の振幅を示すＤＣＴ係数を生成する。そして、高周波成分に対する人の視覚感度が弱いことを利用して、高周波成分のＤＣＴ係数の桁数を多く削減することで、映像データの圧縮を行っている。
【０００５】
【発明が解決しようとする課題】
しかし、前記従来の技術において、ＭＰＥＧ−２等の映像データの圧縮は、動き補償フレーム間予測における動きベクトルの大きさや、マクロブロックの周波数成分に依存して情報量の削減を行っており、映像データの内容を考慮したものではなかった。
【０００６】
このため、携帯端末やＰＤＡのような小型の携帯端末に高精細な映像データを配信しようとすると、帯域圧縮によって多くの情報量が削減され、その映像データを配信された携帯端末で表示する表示映像は、画面全体に画質が劣化した映像となってしまう。すなわち、携帯端末で表示される表示映像は、その映像内における注目すべき被写体等が映された注目領域が、それ以外の領域である背景領域と同程度に画質劣化してしまうという問題があった。
【０００７】
本発明は、以上のような問題点に鑑みてなされたものであり、映像データを圧縮したときに、その映像データにおける注目すべき被写体等が映された注目領域の画質劣化を、それ以外の領域である背景領域よりも軽減させるとともに、映像データの圧縮率を高めることを可能にした映像データ圧縮装置、その方法及びそのプログラムを提供することを目的とする。
【０００８】
【課題を解決するための手段】
本発明は、前記目的を達成するために創案されたものであり、まず、請求項１に記載の映像データ圧縮装置は、入力された映像データについて、各画素の表現できる色数を示す階調を削減することで、前記映像データの圧縮を行う映像データ圧縮装置であって、前記映像データを、映像フレーム毎に被写体が存在する注目領域とそれ以外の背景領域とに分割する領域分割制御手段と、この領域分割制御手段によって分割された前記注目領域及び前記背景領域毎に、前記階調を個別に削減する階調削減制御手段と、を備える構成とした。
【０００９】
かかる構成によれば、映像データ圧縮装置は、領域分割制御手段によって、映像データを映像フレーム毎に注目領域とその注目領域以外の背景領域とに分割する。このとき、注目領域が動きのある領域である場合は、その注目領域の動きベクトルを求めることで、注目領域と背景領域とを識別して、領域の分割を行う。また、注目領域が背景領域と比較して、色等の特徴によって識別が可能な場合は、その色等の特徴によって閾値処理を行うことで領域を分割することとしてもよい。
【００１０】
そして、映像データ圧縮装置は、階調削減制御手段によって、領域分割制御手段で分割された領域毎に、その領域の階調を個別に削減する。例えば、映像データを削減することで映像データの圧縮を行う場合、背景領域の階調を注目領域の階調よりも多く削減することで、注目領域の画質の劣化を背景領域の画質の劣化に比べて抑えるように作用する。
【００１１】
また、請求項２に記載の映像データ圧縮装置は、請求項１に記載の映像データ圧縮装置において、前記領域分割制御手段は、前記映像データの連続した映像フレーム間で特定の大きさのブロック毎に動きベクトルを算出する動きベクトル算出手段と、この動きベクトル算出手段によって算出された動きベクトルに基づいて、前記映像フレーム内における前記ブロックが、前記注目領域に含まれるブロックか、それ以外の前記背景領域に含まれるブロックかを識別する領域識別手段と、を備える構成とした。
【００１２】
かかる構成によれば、映像データ圧縮装置は、動きベクトル算出手段によって、映像データの連続した映像フレーム間で特定の大きさのブロック毎に動きベクトルを算出する。例えば、このブロックは、ＭＰＥＧ−２等の動き補償予測に使用されるマクロブロックとする。そして、領域識別手段によって、動きベクトル算出手段で算出した動きベクトルの大きさに基づいて、映像フレーム内におけるブロック（マクロブロック）が、動きのある注目領域に含まれるブロックか、それ以外の背景領域に含まれるブロックかを識別する。これによって、映像フレームをブロック（マクロブロック）単位で注目領域と背景領域とに分割し、注目領域と背景領域との映像データを独立して加工（削減）することが可能になる。
【００１３】
さらに、請求項３に記載の映像データ圧縮装置は、請求項２に記載に映像データ圧縮装置において、前記領域分割制御手段が、前記領域識別手段で識別された前記注目領域と前記背景領域とが隣接する領域において、前記注目領域のブロックと前記背景領域のブロックとの相関に基づいて、前記背景領域のブロックを前記注目領域のブロックとして拡張させる領域拡張手段を備える構成とした。
【００１４】
かかる構成によれば、映像データ圧縮装置は、領域拡張手段によって、注目領域と背景領域とが隣接する領域において、隣接するブロック（マクロブロック）の相関、例えば、輝度や色等の特徴量で類似性のある背景領域のブロックを注目領域のブロックとする。これによって、動きベクトルの検出では注目領域と認識されない動きの少ない領域を注目領域として拡張することが可能になる。
【００１５】
また、請求項４に記載の映像データ圧縮装置は、請求項１乃至請求項３のいずれか１項に記載の映像データ装置において、前記階調削減制御手段が、前記背景領域に割り当てられている階調を、前記注目領域に割り当てられている階調よりも多く削減することを特徴とする。
【００１６】
かかる構成によれば、映像データ圧縮装置は、階調削減制御手段によって、背景領域に割り当てる階調を、注目領域に割り当てる階調よりも低くする。これによって、映像データを圧縮する際に、注目領域の圧縮率を高めることが可能になる。
【００１７】
さらに、請求項５に記載の映像データ圧縮装置は、請求項１乃至請求項４のいずれか１項に記載の映像データ装置において、前記階調削減制御手段が、前記注目領域及び前記背景領域毎に色差成分に割り当てられている階調を、輝度成分に割り当てられている階調よりも多く削減することを特徴とする。
【００１８】
かかる構成によれば、映像データ圧縮装置は、階調削減制御手段によって、注目領域及び背景領域毎に色差成分に割り当てる階調を、輝度成分に割り当てる階調よりも低くする。これによって、映像データを圧縮する際に、人間の視覚が輝度成分に比べて色差成分の感度が低いため、映像データの画質の劣化を抑えたままで圧縮効率を高めることが可能になる。
【００１９】
また、請求項６に記載の映像データ圧縮方法は、入力された映像データについて、各画素の表現できる色数を示す階調を削減することで、前記映像データの圧縮を行う映像データ圧縮方法であって、前記映像データを、映像フレーム毎に被写体が存在する注目領域とそれ以外の背景領域とに分割する領域分割ステップと、この領域分割ステップで分割された前記注目領域及び前記背景領域毎に、前記階調を個別に削減する階調削減ステップとを含み、前記階調削減ステップが、前記背景領域に割り当てられている階調を、前記注目領域に割り当てられている階調よりも多く削減することを特徴とする。
【００２０】
この方法によれば、映像データ圧縮方法は、領域分割ステップによって、映像データを映像フレーム毎に注目領域とその注目領域以外の背景領域とに分割する。このとき、注目領域が動きのある領域である場合は、その注目領域の動きベクトルを求めることで、注目領域と背景領域とを識別して、領域の分割を行う。また、注目領域が背景領域と比較して、色等の特徴によって識別が可能な場合は、その色等の特徴によって閾値処理を行うことで領域を分割することとしてもよい。
【００２１】
そして、映像データ圧縮方法は、階調削減ステップによって、領域分割ステップで分割された領域毎に、その領域の階調を個別に削減する。このとき、背景領域の階調を注目領域の階調よりも多く削減することで、注目領域の画質劣化を背景領域の画質の劣化に比べて抑えるように作用する。
【００２２】
さらに、請求項７に記載の映像データ圧縮プログラムは、入力された映像データについて、各画素の表現できる色数を示す階調を削減して、前記映像データの圧縮を行うために、コンピュータを、前記映像データを、映像フレーム毎に被写体が存在する注目領域とそれ以外の背景領域とに分割する領域分割制御手段、この領域分割制御手段によって分割された前記注目領域及び前記背景領域毎に、前記階調を個別に削減する階調削減制御手段として機能させ、前記階調削減制御手段が、前記背景領域に割り当てられている階調を、前記注目領域に割り当てられている階調よりも多く削減することを特徴とする。
【００２３】
かかる構成によれば、映像データ圧縮プログラムは、領域分割制御手段によって、映像データを映像フレーム毎に注目領域とその注目領域以外の背景領域とに分割し、階調削減制御手段によって、領域分割制御手段で分割された領域毎に、その領域の階調を個別に削減する。このとき、背景領域の階調を注目領域の階調よりも多く削減することで、注目領域の画質の劣化を背景領域の画質の劣化に比べて抑えるように作用する。
【００２４】
【発明の実施の形態】
以下、本発明の実施の形態について図面を参照して説明する。
（映像データ圧縮装置の構成：第一の実施の形態）
図１は、本発明における第一の実施の形態である映像データ圧縮装置１の構成を示したブロック図である。図１に示した映像データ圧縮装置１は、入力された映像データを、動きのある注目領域とそれ以外の領域である背景領域とに分割し、その分割された領域毎に画素値を表現するための階調を削減することで、映像データを圧縮して出力するものであり、領域分割制御手段１０と階調削減制御手段２０とを備える構成とした。
【００２５】
なお、ここで注目領域とは、図７に示したような映像フレームＦ上に登場する人物等の動きのある領域（注目領域ＦＧ）を指し、背景領域は注目領域ＦＧ以外の領域（背景領域ＢＧ）を指す。なお、注目領域ＦＧは図７に示すように映像フレームＦ上に複数存在していてもよい。
【００２６】
領域分割制御手段１０は、入力された映像データを、映像フレーム単位で動きのある注目領域か、あるいは、それ以外の背景領域かを、特定の大きさのブロック毎に判定することで領域の分割を行うものである。ここでは、この領域分割制御手段１０は、動きベクトル算出部１１と、グローバルベクトル算出部１２と、領域識別部１３とを備えるものとした。
【００２７】
また、階調削減制御手段２０は、入力された映像データの注目領域及び背景領域の画素値を表現するための階調を、個別に削減するものである。この階調削減制御手段２０は、階調設定部２１と、階調削減部２２とを備えるものとした。
なお、ここで特定の大きさのブロックは、ＭＰＥＧ−２等の動き補償予測に使用されるマクロブロック（１６×１６画素）とする。
【００２８】
動きベクトル算出部（動きベクトル算出手段）１１は、連続して入力される映像データの映像フレームから、動きベクトルを算出するものである。ここでは、動きベクトル算出部１１を映像遅延部１１ａと動きベクトル検出部１１ｂとで構成した。
【００２９】
映像遅延部１１ａは、入力された映像データ（入力映像データ）を映像フレーム単位で遅延させるものである。この映像遅延部１１ａで１映像フレーム分遅延された映像データ（遅延映像データ）は、動きベクトル検出部１１ｂへ出力される。
【００３０】
動きベクトル検出部１１ｂは、入力された映像データ（入力映像データ）と、映像遅延部１１ａで遅延された遅延映像データとに基づいて、映像フレームのマクロブロック単位で動きベクトルを検出するものである。この動きベクトル検出部１１ｂで検出した動きベクトルは、グローバルベクトル算出部１２及び領域識別部１３へ出力される。なお、この動きベクトルは、入力映像データの映像フレームと、遅延映像データの映像フレームとの間、すなわち隣接映像フレーム間で、マクロブロック毎にブロックマッチング法によって求められる。
【００３１】
グローバルベクトル算出部１２は、動きベクトル検出部１１ｂから入力されたマクロブロック毎の動きベクトルに基づいて、その複数の動きベクトルの中で、最も多く検出された動きベクトルをグローバルベクトルとして算出するものである。このグローバルベクトル算出部１２で算出されたグローバルベクトルは、領域識別部１３へ出力される。なお、ここで算出されたグローバルベクトルは、入力された映像データの中で、最も領域の大きい背景領域の動きベクトルとみなすことができる。
【００３２】
領域識別部（領域識別手段）１３は、動きベクトル検出部１１ｂで検出されたマクロブロック毎の動きベクトルと、グローバルベクトル算出部１２で算出されたグローバルベクトルとに基づいて、マクロブロックが動きのある注目領域に含まれるものか、それ以外の領域である背景領域に含まれるものかを識別するものである。この領域識別部１３で識別されたマクロブロック毎の領域（注目領域又は背景領域）は、マクロブロックの座標とともに領域情報として階調削減制御手段２０の階調設定部２１へ出力される。
【００３３】
ここでは、グローバルベクトルとは異なる動きをする映像（マクロブロック）を注目領域とみなす。例えば、各マクロブロックの動きベクトルとグローバルベクトルとを比較して、両ベクトルの差が予め設定した値（例えば、映像フレーム当たり４画素）以上の場合に、そのマクロブロックが注目領域に含まれるものと判断する。
【００３４】
階調設定部２１は、領域分割制御手段１０の領域識別部１３で識別された注目領域及び背景領域の各マクロブロック毎に階調の削減量を設定し、マクロブロックの座標とともに削減情報として階調削減部２２へ出力するものである。ここでは、映像（映像データ）を伝送するための伝送路のＣ／Ｎ（Ｃａｒｒｉｅｒ　ｔｏ　Ｎｏｉｓｅ　Ｒａｔｉｏ）情報や映像を蓄積するネットワークサーバのバッファ占有情報等に基づいて、予め映像データの削減量（映像削減量）を求めておき、キーボード等の入力手段（図示せず）から、その映像削減量を階調設定部２１に入力するものとする。そして、階調設定部２１では、その映像削減量に基づいて、背景領域の階調が注目領域の階調よりも低くなるように各領域の階調削減量を設定する。
【００３５】
ここで、図３及び図４を参照して、階調設定部２１における注目領域及び背景領域の階調削減量の設定方法について説明する。ここでは、映像データをＹＣ（輝度／色差）映像信号とし、その階調が８ビットで表現されているものとする。図３は、ＹＣ（輝度／色差）映像信号の例として、ＭＰＥＧ−２におけるマクロブロックの構成を示したものである。図４は、階調を削減する削除内容の優先順位を示したものである。
【００３６】
図３に示したように、ＭＰＥＧ−２では、マクロブロックは１６×１６画素のＹ（輝度）映像信号と、８×８画素のＣ（色差）映像信号（Ｃ_ｒ映像信号及びＣ_ｂ映像信号）で構成されている。ここで階調を削減するとは、各画素を示すビット数そのものを削減して、その画素が表現できるレベルを少なくすることである。例えば、８ビットで２５６階調の映像を表現可能な元の画素Ｂ１から３ビット削減することで、削減後の画素Ｂ２は５ビットで３２階調までしか表現することができない。
【００３７】
そして、図４に示したように、階調設定部２１（図１）は優先順位（１）〜（８）の順番で、階調削減量を各マクロブロックに設定する。
優先順位（１）では、背景領域のＣ（色差）映像信号の階調を削減するように設定し、優先順位（２）では、背景領域のＹ（輝度）映像信号の階調を削減するように設定する。そして、優先順位（３）では、注目領域のＣ（色差）映像信号の階調を削減するように設定し、優先順位（４）では、注目領域のＹ（輝度）映像信号の階調を削減するように設定する。なお、ここまでの削減では、階調が最小で５ビットになるまで削減できるものとする。
【００３８】
ここで、Ｃ（色差）映像信号の削減をＹ（輝度）映像信号の削減よりも優先したのは、人間の視覚が輝度成分に比べて色差成分の感度が低いという特徴を有しているからである。また、ここで最小階調を５ビットとしたのは、階調８ビットの原画映像に対して階調を４ビット以下に削減すると画質が著しく劣化することが報告されていることによる（参考文献：大塚　他，“時間・空間・階調解像度とＴＶ画質”，電子情報通信学会画像工学研究会，ＩＥ８７−１１４，ｐｐ．１７−２４，１９８７）。
【００３９】
そして、さらに階調の削減を要する場合は、優先順位（５）として、背景領域のＣ（色差）映像信号の階調を５ビット未満（最小０ビットまで）に削減し、優先順位（６）として、注目領域のＣ（色差）映像信号の階調を５ビット未満（最小０ビットまで）に削減する。また、優先順位（７）では、背景領域のＹ（輝度）映像信号の階調を５ビット未満（最小０ビットまで）に削減し、優先順位（８）では、注目領域のＹ（輝度）映像信号の階調を５ビット未満（最小０ビットまで）に削減する。
なお、優先順位（６）及び優先順位（７）はその優先順位を逆にすることとしてもよい。また、映像フレーム内に注目領域が存在しない場合は、優先順位（３）、（４）、（６）及び（８）は、考慮しないものとする
【００４０】
また、優先順位（１）〜（４）において、最小階調を５ビットとしたが、処理対象映像の解像度に依存して変更することとしてもよい。例えば、ハイビジョン方式やＮＴＳＣ方式の放送映像の場合は最小階調を６ビットとし、ＳＩＦ（水平３５２×垂直２４０画素）やＱＳＩＦ（水平１７６×垂直１２０画素）の場合は最小階調を５ビットとする。
図１に戻って説明を続ける。
【００４１】
階調削減部２２は、階調設定部２１で設定されたマクロブロック毎の階調の削除量（削減情報）に基づいて、各マクロブロックの階調を削減するものである。この階調削減部２２で階調を削減された映像データは、圧縮を行った映像データとして出力される（出力映像データ）。例えば、映像データの画素が８ビットで構成されており、階調設定部２１から通知される削減情報において、あるマクロブロックの階調の削減量が２ビットであった場合、階調削減部２２は、そのマクロブロックの階調を６（８マイナス２）ビットとする。これによって、映像データの情報量を圧縮することができる。
【００４２】
以上、一実施形態に基づいて、映像データ圧縮装置１の構成について説明したが、本発明はこれに限定されるものではない。例えば、領域分割制御手段１０で行う注目領域の抽出は、動きベクトルを用いる以外にも、注目領域と背景領域の色の特徴量が異なる場合は、特定の階調値を閾値として注目領域を抽出することも可能である。この閾値による注目領域の抽出では、注目領域の被写体は動いている必要はない。
【００４３】
また、映像データ圧縮装置１は、コンピュータにおいて各手段を各機能プログラムとして実現することも可能であり、各機能プログラムを結合して映像データ圧縮プログラムとして動作させることも可能である。
【００４４】
（映像データ圧縮装置１の動作）
次に、図１及び図５を参照して、映像データ圧縮装置１の動作について説明する。図５は、映像データ圧縮装置１の動作を示すフローチャートである。
［領域分割ステップ］
まず、映像データ圧縮装置１は、映像遅延部１１ａによって、入力された映像データ（入力映像データ）を１映像フレーム分遅延させる（ステップＳ１）。そして、動きベクトル検出部１１ｂによって、入力映像データの映像フレームと映像遅延部１１ａで遅延された１映像フレーム前の映像フレームとの間（隣接映像フレーム間）で、マクロブロック毎にブロックマッチングを行うことで動きベクトルを検出する（ステップＳ２）。
【００４５】
この動きベクトル検出部１１ｂで検出された動きベクトルに基づいて、グローバルベクトル算出部１２が、複数の動きベクトルの中で、最も多く検出された動きベクトルをグローバルベクトルとして算出する（ステップＳ３）。このグローバルベクトルは背景領域の動きベクトルとみなすことができる。
【００４６】
そして、映像データ圧縮装置１は、領域識別部１３によって、ステップＳ２で検出したマクロブロックの動きベクトルと、ステップＳ３で算出したグローバルベクトルとを比較して、両ベクトルの差が予め設定した値（例えば、映像フレーム当たり４画素）以上であるマクロブロックを注目領域に含まれるものとして識別する。これによって、映像フレームを注目領域と背景領域とに分割する（ステップＳ４）。
【００４７】
［階調削減ステップ］
そして、映像データ圧縮装置１は、階調設定部２１によって、キーボード等の入力手段（図示せず）から入力された映像データの削減量（映像削減量）に基づいて、背景領域の階調が注目領域の階調よりも低くなるように各領域のマクロブロックの階調削減量を設定する（ステップＳ５）。このとき、階調削減量は特定の優先順位（図３参照）に基づいて、設定するものとする。
【００４８】
そして、映像データ圧縮装置１は、ステップＳ５で階調削減量を設定されたマクロブロックは、階調削減部２２によって、その設定された階調削減量分の階調を削減し（ステップＳ６）、そのマクロブロック毎に階調を削減した映像フレームを時系列に圧縮映像データ（出力映像データ）として出力する（ステップＳ７）。そして、映像データ（入力映像データ）の入力が終了したかどうかを判定し（ステップＳ８）、終了した場合（Ｙｅｓ）は、動作を終了する。一方、まだ映像データが入力される場合（ステップＳ８でＮｏ）は、ステップＳ１へ戻って動作を継続する。
【００４９】
以上の各ステップによって、映像データ圧縮装置１は、入力映像データ内の注目領域と背景領域とを識別して、各々の領域の階調を独立して削減することが可能になる。そして、映像データ圧縮装置１で削減し圧縮された映像データは、注目領域の画質の劣化を背景領域よりも軽減した映像データとなる。
【００５０】
（映像データ圧縮装置の構成：第二の実施の形態）
次に、図２を参照して、本発明における第二の実施の形態である映像データ圧縮装置１Ｂについて説明する。図２は、映像データ圧縮装置１Ｂの構成を示したブロック図である。図２に示した映像データ圧縮装置１Ｂは、入力された映像データを、動きのある注目領域とそれ以外の領域である背景領域とに分割し、その分割された領域毎に画素値を表現するための階調を削減することで、映像データを圧縮して出力するものである。
【００５１】
映像データ圧縮装置１Ｂは、映像データ圧縮装置１（図１）のグローバルベクトル算出部１２の代わりにカメラデータベクトル算出部１４を付加し、さらに領域拡張部１５を追加して構成した。この追加したカメラデータベクトル算出部１４及び領域拡張部１５の構成、並びに領域識別部１３Ｂの機能以外は、図１に示した映像データ圧縮装置１と同一のものであるので、同一の符号を付し、説明は省略する。
【００５２】
カメラデータベクトル算出部１４は、映像データ（入力映像データ）を撮影したときの撮影カメラ（図示せず）のパン、チルト、ズーム等のカメラデータに基づいて、入力映像データの映像フレームに動きのある注目領域が存在しないと仮定したときのマクロブロックの動きベクトル（背景動きベクトル）を算出するものである。なお、このカメラデータは、入力映像データに連動して時系列に入力されるデータである。このカメラデータベクトル算出部１４で算出された背景動きベクトルは、領域識別部１３Ｂに出力される。
【００５３】
このカメラデータベクトル算出部１４におけるカメラデータを用いた動きベクトルの算出は、例えば、「鄭文濤等，“Ａ　Ｈｉｇｈ−Ｐｒｉｃｉｓｉｏｎ　Ｃａｍｅｒａ　Ｏｐｅｒａｔｉｏｎ　Ｐａｒａｍｅｔｅｒ　Ｍｅａｓｕｒｅｍｅｎｔ　Ｓｙｓｔｅｍ　ａｎｄ　Ｉｔｓ　Ａｐｐｌｉｃａｔｉｏｎ　ｔｏ　Ｉｍａｇｅ　Ｍｏｔｉｏｎ　Ｉｎｆｅｒｒｉｎｇ”，ＩＥＥＥ　Ｔｒａｎｓａｃｔｉｏｎｓ　ｏｎ　Ｂｒｏａｄｃａｓｔｉｎｇ，Ｖｏｌ．４７，Ｎｏ．１，ｐ．４６−５５，Ｍａｒｃｈ　２００１」で開示されている技術を用いることができる。
【００５４】
すなわち、カメラデータベクトル算出部１４では、カメラの動き（パン、チルト、ズーム等）によって、映像フレーム内のあるマクロブロックが当該映像フレームのどこに移動するかを算出し、そのマクロブロックの移動方向及び移動量を背景動きベクトルとする。例えば、カメラを画面の右方向にパンすると、背景として映っている領域は左方向に移動したように見える。この移動した領域は映像フレーム内では動きを持っているが、実際には背景領域となるものである。このように、カメラデータベクトル算出部１４は、カメラの動きによる背景の動きベクトルを算出する。
【００５５】
領域識別部１３Ｂは、動きベクトル検出部１１ｂで検出されたマクロブロック毎の動きベクトルと、カメラデータベクトル算出部１４で算出された背景動きベクトルとに基づいて、マクロブロックが動きのある注目領域に含まれるものか、それ以外の領域である背景領域に含まれるものかを識別するものである。この領域識別部１３Ｂで識別されたマクロブロック毎の領域（注目領域又は背景領域）は、マクロブロックの座標とともに領域情報として領域拡張部１５へ出力される。
【００５６】
この領域識別部１３Ｂでは、動きベクトル検出部１１ｂで検出されたマクロブロック毎の動きベクトルとカメラデータベクトル算出部１４で算出された背景動きベクトルとを比較することで、背景動きベクトル以外の動きをするマクロブロックを注目領域に含まれるマクロブロックであると判断する。
【００５７】
領域拡張部（領域拡張手段）１５は、領域識別部１３Ｂから出力される領域情報に基づいて、注目領域と背景領域とが隣接する領域で、その両領域のマクロブロックの相関を調べ、予め設定した相関値よりも高い場合に、その背景領域のマクロブロックを注目領域のマクロブロックとみなして、注目領域の拡張を行うものである。例えば、各マクロブロックの輝度、色等の特徴量を比較することで相関を調べる。この領域拡張部１５で注目領域の拡張を行った領域情報は、階調削減制御手段２０の階調設定部２１へ出力される。
【００５８】
以上、映像データ圧縮装置１Ｂの構成について説明したが、映像データ圧縮装置１Ｂは、コンピュータにおいて各手段を各機能プログラムとして実現することも可能であり、各機能プログラムを結合して映像データ圧縮プログラムとして動作させることも可能である。
【００５９】
（映像データ圧縮装置１Ｂの動作）
次に、図２及び図６を参照して、映像データ圧縮装置１Ｂの動作について説明する。図６は、映像データ圧縮装置１Ｂの動作を示すフローチャートである。
まず、映像データ圧縮装置１Ｂは、映像遅延部１１ａによって、入力された映像データ（入力映像データ）を１映像フレーム分遅延させる（ステップＳ１０）。そして、動きベクトル検出部１１ｂによって、入力映像データの映像フレームと映像遅延部１１ａで遅延された１映像フレーム前の映像フレームとの間（隣接映像フレーム間）で、マクロブロック毎にブロックマッチングを行うことで動きベクトルを検出する（ステップＳ１１）。
【００６０】
そして、映像データ圧縮装置１Ｂは、カメラデータベクトル算出部１４によって、映像データ（入力映像データ）を撮影したときの撮影カメラ（図示せず）のパン、チルト、ズーム等のカメラデータに基づいて、背景領域が映像フレーム内で移動する動きベクトル（背景動きベクトル）を算出する（ステップＳ１２）。
【００６１】
次に、映像データ圧縮装置１Ｂは、領域識別部１３Ｂによって、ステップＳ１１で検出したマクロブロックの動きベクトルと、ステップＳ１２で算出した背景動きベクトルとを比較して、背景動きベクトルとは異なる動きベクトルを持つマクロブロックを注目領域に含まれるマクロブロックとして識別する。これによって、映像フレームを注目領域と背景領域とに分割する（ステップＳ１３）。
【００６２】
さらに、映像データ圧縮装置１Ｂは、領域拡張部１５によって、注目領域と背景領域とが隣接する領域で、その両領域のマクロブロックの相関を調べ、予め設定した相関値よりも高い場合に、その背景領域のマクロブロックを注目領域のマクロブロックとみなして、注目領域の拡張を行う（ステップＳ１４）。
なお、これ以降の動作は、図５の階調削除ステップ（ステップＳ５以降）と同様であるので説明は省略する。
【００６３】
以上の各ステップによって、映像データ圧縮装置１Ｂは、移動カメラ等のようなカメラを動作させて被写体を撮影した映像データに対して、その入力映像データ内の注目領域と背景領域とを識別して、各々の領域の階調を独立して削減することが可能になる。そして、映像データ圧縮装置１Ｂで階調を削減し圧縮された映像データは、注目領域の画質の劣化を背景領域よりも軽減した映像データとなる。
【００６４】
【発明の効果】
以上説明したとおり、本発明に係る映像データ圧縮装置、その方法及びそのプログラムでは、以下に示す優れた効果を奏する。
【００６５】
請求項１、請求項６又は請求項７に記載の発明によれば、入力映像データから注目領域を抽出して、注目領域とそれ以外の領域である背景領域とを識別して、各々の領域の階調を個別に削減することが可能になる。これによって、注目領域の階調よりも背景領域の階調を低くすることで、注目領域の画質の劣化を軽減すし、映像データの圧縮率を高めることができる。
【００６６】
例えば、携帯端末等の小さい画面では、映像全体を鑑賞することよりも映像データに含まれる情報を得ることが重要であるため、その情報を含んだ注目領域の画質の劣化を軽減することは、映像データを配信するサービスにおいて有効である。
【００６７】
請求項２に記載の発明によれば、動きのある領域をブロック単位で注目領域として認識することが可能になる。これによって、注目領域とそれ以外の領域である背景領域との階調を個別にブロック単位で容易に削減することができる。
【００６８】
請求項３に記載の発明によれば、動きベクトルによって、背景領域であると認識された領域であっても、色等の特徴量によって注目領域として判定することが可能になる。これによって、例えば、人間が動いているにも関わらず、洋服の端等で動きが少ない領域を注目領域として認識することが可能になる。
【００６９】
請求項４に記載の発明によれば、注目領域の階調よりも背景領域の階調を低くすることで、従来の圧縮と比較して、圧縮データに占める注目領域の比率を高めることで注目領域の画質の劣化を軽減するとともに、映像データの圧縮率を高めることができる。
【００７０】
請求項５に記載の発明によれば、人間の視覚が輝度成分に比べて色差成分の感度が低いため、注目領域及び背景領域毎に、輝度成分に割り当てる階調よりも色差成分に割り当てる階調を低くすることで、画質の劣化を抑えたままで圧縮率を高めることができる。
【図面の簡単な説明】
【図１】本発明の第一の実施の形態に係る映像データ圧縮装置の全体構成を示すブロック図である。
【図２】本発明の第二の実施の形態に係る映像データ圧縮装置の全体構成を示すブロック図である。
【図３】マクロブロックの構成例を説明するための説明図である。
【図４】階調設定部において階調を削減する階調削減内容とその優先順位を説明するための説明図である。
【図５】本発明の第一の実施の形態に係る映像データ圧縮装置の動作を示すフローチャートである。
【図６】本発明の第二の実施の形態に係る映像データ圧縮装置の動作を示すフローチャートである。
【図７】注目領域及び背景領域の一例を説明するための説明図である。
【符号の説明】
１、１Ｂ……映像データ圧縮装置
１０、１０Ｂ……領域分割制御手段
１１……動きベクトル算出部（動きベクトル算出手段）
１１ａ……映像遅延部
１１ｂ……動きベクトル検出部
１２……グローバルベクトル算出部
１３、１３Ｂ……領域識別部（領域識別手段）
１４……カメラデータベクトル算出部
１５……領域拡張部（領域拡張手段）
２０……階調削減制御手段
２１……階調設定部
２２……階調削減部
[0001]
[Technical field of the invention]
The present invention relates to a technique for compressing video data and, more particularly, to a video data compression device, method, and program that compress video data while reducing image-quality degradation of a region of interest in the video data.
[0002]
[Prior art]
At present, services that distribute video data to information terminals such as mobile terminals and PDAs (Personal Digital Assistants) are beginning to spread. Even when high-definition video data is to be distributed, the bandwidth of the transmission path is limited, so the video data is band-compressed with MPEG-4 (Moving Picture Experts Group 4) or the like before distribution.
[0003]
Conventionally, video data compression has generally used a technique that combines motion-compensated inter-frame prediction (MC: Motion Compensation) with the discrete cosine transform (DCT: Discrete Cosine Transform), as in MPEG-2 (Moving Picture Experts Group 2).
[0004]
That is, compression of video data by MPEG-2 or the like first generates, by motion-compensated inter-frame prediction, the prediction error between video frames in units of 16 × 16 pixels (macroblocks), and then applies the discrete cosine transform to that prediction error to generate DCT coefficients indicating the amplitudes of the frequency components. Because human visual sensitivity to high-frequency components is weak, the video data is compressed by cutting more digits from the DCT coefficients of the high-frequency components.
[0005]
[Problems to be solved by the invention]
However, in the conventional technique described above, compression of video data by MPEG-2 or the like reduces the amount of information depending on the magnitude of the motion vectors in motion-compensated inter-frame prediction and on the frequency components of the macroblocks; it does not take the content of the video data into account.
[0006]
For this reason, when high-definition video data is distributed to a small portable terminal such as a mobile terminal or a PDA, band compression removes a large amount of information, and the video displayed on the receiving terminal is degraded in image quality across the entire screen. In other words, the displayed video has the problem that the region of interest, which shows a notable subject or the like, is degraded to the same extent as the background region, that is, the rest of the frame.
[0007]
The present invention has been made in view of the above problems. Its object is to provide a video data compression device, method, and program that, when video data is compressed, reduce the image-quality degradation of the region of interest showing a notable subject or the like more than that of the background region, which is the remaining region, and also make it possible to increase the compression ratio of the video data.
[0008]
[Means for Solving the Problems]
The present invention was devised to achieve the above object. First, the video data compression device according to claim 1 compresses input video data by reducing the gradation, which indicates the number of colors each pixel can represent, and comprises: a region division control means that divides the video data, for each video frame, into a region of interest in which a subject is present and the remaining background region; and a gradation reduction control means that individually reduces the gradation for each of the region of interest and the background region divided by the region division control means.
[0009]
According to this configuration, the video data compression device uses the region division control means to divide the video data, for each video frame, into the region of interest and the background region other than the region of interest. If the region of interest is a moving region, the device obtains the motion vectors of the region of interest, uses them to distinguish the region of interest from the background region, and divides the frame accordingly. Alternatively, if the region of interest can be distinguished from the background region by a feature such as color, the frame may be divided by thresholding on that feature.
[0010]
The video data compression device then uses the gradation reduction control means to individually reduce the gradation of each region produced by the region division control means. For example, when video data is compressed by reducing its data, reducing the gradation of the background region more than that of the region of interest keeps the image-quality degradation of the region of interest smaller than that of the background region.
[0011]
The video data compression device according to claim 2 is the video data compression device according to claim 1, wherein the region division control means comprises: a motion vector calculation means that calculates a motion vector for each block of a specific size between consecutive video frames of the video data; and a region identification means that identifies, based on the motion vectors calculated by the motion vector calculation means, whether each block in the video frame belongs to the region of interest or to the remaining background region.
[0012]
According to this configuration, the video data compression device uses the motion vector calculation means to calculate a motion vector for each block of a specific size between consecutive video frames of the video data. The block is, for example, the macroblock used for motion-compensated prediction in MPEG-2 or the like. The region identification means then identifies, based on the magnitude of the motion vector calculated by the motion vector calculation means, whether each block (macroblock) in the video frame belongs to the moving region of interest or to the remaining background region. This makes it possible to divide a video frame into a region of interest and a background region in units of blocks (macroblocks) and to process (reduce) the video data of the two regions independently.
[0013]
Further, the video data compression device according to claim 3 is the video data compression device according to claim 2, wherein the region division control means comprises a region expansion means that, where the region of interest and the background region identified by the region identification means adjoin each other, expands the region of interest by treating a block of the background region as a block of the region of interest, based on the correlation between the block of the region of interest and the block of the background region.
[0014]
According to this configuration, where the region of interest and the background region adjoin each other, the video data compression device uses the region expansion means to treat as a block of the region of interest any background block that correlates with the adjacent block (macroblock), for example one that is similar in a feature such as luminance or color. This makes it possible to extend the region of interest to low-motion areas that motion vector detection alone does not recognize as part of it.
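The expansion described above can be sketched as follows. The source names luminance and color only as examples of the feature and does not define the correlation measure, so the mean-luminance comparison, the 4-neighborhood, the single pass, and the names `expand_interest`, `mean_luma`, and `max_diff` are all assumptions made for illustration.

```python
def expand_interest(labels, mean_luma, max_diff=8):
    """Promote background blocks that adjoin a similar region-of-interest block.

    labels:    dict mapping (col, row) -> "interest" or "background"
    mean_luma: dict mapping (col, row) -> mean luminance of that macroblock
    max_diff:  hypothetical similarity threshold (not given in the source)
    """
    expanded = dict(labels)
    for (col, row), label in labels.items():
        if label != "background":
            continue
        # Look only at the four directly adjacent macroblocks (an assumption).
        for neighbor in ((col - 1, row), (col + 1, row), (col, row - 1), (col, row + 1)):
            if (labels.get(neighbor) == "interest"
                    and abs(mean_luma[(col, row)] - mean_luma[neighbor]) <= max_diff):
                expanded[(col, row)] = "interest"
                break
    return expanded
```

Because the function reads the original `labels` and writes to a copy, the region of interest grows by at most one block per call; whether the expansion should be repeated is not stated in the source.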
[0015]
The video data compression device according to claim 4 is the video data compression device according to any one of claims 1 to 3, wherein the gradation reduction control means reduces the gradation assigned to the background region more than the gradation assigned to the region of interest.
[0016]
According to this configuration, the video data compression device uses the gradation reduction control means to make the gradation assigned to the background region lower than the gradation assigned to the region of interest. This makes it possible to increase the compression ratio of the region of interest when compressing video data.
[0017]
Further, the video data compression device according to claim 5 is the video data compression device according to any one of claims 1 to 4, wherein the gradation reduction control means, for each of the region of interest and the background region, reduces the gradation assigned to the color difference components more than the gradation assigned to the luminance component.
[0018]
According to this configuration, the video data compression device uses the gradation reduction control means to make, for each of the region of interest and the background region, the gradation assigned to the color difference components lower than the gradation assigned to the luminance component. Because human vision is less sensitive to color difference components than to the luminance component, this makes it possible to raise the compression efficiency while suppressing degradation of the image quality of the video data.
[0019]
The video data compression method according to claim 6 compresses input video data by reducing the gradation, which indicates the number of colors each pixel can represent, and includes: a region division step of dividing the video data, for each video frame, into a region of interest in which a subject is present and the remaining background region; and a gradation reduction step of individually reducing the gradation for each of the region of interest and the background region divided in the region division step, wherein the gradation reduction step reduces the gradation assigned to the background region more than the gradation assigned to the region of interest.
[0020]
According to this method, the region division step divides the video data, for each video frame, into the region of interest and the background region other than the region of interest. If the region of interest is a moving region, the motion vectors of the region of interest are obtained and used to distinguish the region of interest from the background region, and the frame is divided accordingly. Alternatively, if the region of interest can be distinguished from the background region by a feature such as color, the frame may be divided by thresholding on that feature.
[0021]
Then, the gradation reduction step individually reduces the gradation of each region produced in the region division step. Reducing the gradation of the background region more than that of the region of interest keeps the image-quality degradation of the region of interest smaller than that of the background region.
[0022]
Furthermore, the video data compression program according to claim 7, in order to compress input video data by reducing the gradation, which indicates the number of colors each pixel can represent, causes a computer to function as: a region division control means that divides the video data, for each video frame, into a region of interest in which a subject is present and the remaining background region; and a gradation reduction control means that individually reduces the gradation for each of the region of interest and the background region divided by the region division control means, wherein the gradation reduction control means reduces the gradation assigned to the background region more than the gradation assigned to the region of interest.
[0023]
According to this configuration, the video data compression program uses the region division control means to divide the video data, for each video frame, into the region of interest and the background region other than the region of interest, and uses the gradation reduction control means to individually reduce the gradation of each region produced by the region division control means. Reducing the gradation of the background region more than that of the region of interest keeps the image-quality degradation of the region of interest smaller than that of the background region.
[0024]
[Embodiments of the invention]
Hereinafter, embodiments of the present invention will be described with reference to the drawings.
(Configuration of Video Data Compression Device: First Embodiment)
FIG. 1 is a block diagram showing the configuration of a video data compression device 1 according to a first embodiment of the present invention. The video data compression device 1 shown in FIG. 1 divides input video data into a moving region of interest and the remaining background region, reduces the gradation used to represent pixel values in each of the divided regions, and outputs the compressed video data. It comprises a region division control means 10 and a gradation reduction control means 20.
[0025]
Here, the region of interest refers to a moving region, such as a person appearing in a video frame F as shown in FIG. 7 (region of interest FG), and the background region refers to the area other than the region of interest FG (background region BG). As FIG. 7 shows, a video frame F may contain more than one region of interest FG.
[0026]
The region division control means 10 divides the input video data into regions by determining, for each video frame and for each block of a specific size, whether the block belongs to a moving region of interest or to the remaining background region. Here, the region division control means 10 comprises a motion vector calculation unit 11, a global vector calculation unit 12, and a region identification unit 13.
[0027]
The gradation reduction control means 20 individually reduces the gradation used to represent the pixel values of the region of interest and of the background region in the input video data. The gradation reduction control means 20 comprises a gradation setting unit 21 and a gradation reduction unit 22.
Here, the block of a specific size is the macroblock (16 × 16 pixels) used for motion-compensated prediction in MPEG-2 or the like.
[0028]
The motion vector calculation unit (motion vector calculation means) 11 calculates motion vectors from the video frames of the continuously input video data. Here, the motion vector calculation unit 11 consists of a video delay unit 11a and a motion vector detection unit 11b.
[0029]
The video delay unit 11a delays input video data (input video data) in video frame units. The video data (delayed video data) delayed by one video frame in the video delay unit 11a is output to the motion vector detection unit 11b.
[0030]
The motion vector detection unit 11b detects a motion vector for each macroblock of a video frame based on the input video data and the delayed video data delayed by the video delay unit 11a. The motion vectors detected by the motion vector detection unit 11b are output to the global vector calculation unit 12 and the region identification unit 13. Each motion vector is obtained by block matching, macroblock by macroblock, between the video frame of the input video data and the video frame of the delayed video data, that is, between adjacent video frames.
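As an illustration of block matching between adjacent frames, the following is a minimal full-search sketch. The source names only "block matching", so the sum-of-absolute-differences cost, the search range, the sign convention (the vector points from the block in the current frame to its best match in the delayed frame), and the function names are assumptions. Frames are plain lists of rows of luminance samples.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a block in `cur` and a displaced block in `ref`."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def block_motion_vector(cur, ref, bx, by, size=16, search=4):
    """Return the (dx, dy) within +/- `search` pixels that minimizes the SAD.

    (bx, by) is the top-left corner of the block in the current frame `cur`;
    `ref` is the frame delayed by one video frame.
    """
    height, width = len(ref), len(ref[0])
    best, best_cost = (0, 0), sad(cur, ref, bx, by, 0, 0, size)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates that would fall outside the reference frame.
            if not (0 <= bx + dx and bx + dx + size <= width
                    and 0 <= by + dy and by + dy + size <= height):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best
```

Calling `block_motion_vector` once per macroblock position yields the per-macroblock vectors that the following units consume.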
[0031]
The global vector calculation unit 12 takes the per-macroblock motion vectors input from the motion vector detection unit 11b and calculates, as the global vector, the motion vector detected most often among them. The global vector calculated by the global vector calculation unit 12 is output to the region identification unit 13. The global vector calculated here can be regarded as the motion vector of the background region, which occupies the largest area in the input video data.
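Selecting the most frequently detected vector amounts to taking the mode of the per-macroblock vectors. A minimal sketch, assuming the vectors are given as integer `(dx, dy)` tuples; the function name is hypothetical, and the source does not say how ties are resolved.

```python
from collections import Counter


def global_vector(motion_vectors):
    """Return the (dx, dy) that occurs most often among the macroblock motion vectors."""
    if not motion_vectors:
        raise ValueError("at least one motion vector is required")
    # On a tie, Counter.most_common keeps the vector that was encountered first.
    return Counter(motion_vectors).most_common(1)[0][0]
```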
[0032]
The region identification unit (region identification means) 13 identifies, based on the per-macroblock motion vectors detected by the motion vector detection unit 11b and the global vector calculated by the global vector calculation unit 12, whether each macroblock belongs to the moving region of interest or to the remaining background region. The region (region of interest or background region) identified for each macroblock by the region identification unit 13 is output, together with the coordinates of the macroblock, as region information to the gradation setting unit 21 of the gradation reduction control means 20.
[0033]
Here, video (a macroblock) that moves differently from the global vector is regarded as the attention area. For example, the motion vector of each macroblock is compared with the global vector, and if the difference between the two vectors is equal to or greater than a preset value (for example, 4 pixels per video frame), the macroblock is judged to be included in the attention area.
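The comparison against the preset value might look like the following sketch. The specification does not say how the "difference between the two vectors" is measured; the Euclidean length of the difference vector is an assumption here, as are the name `identify_regions` and the string labels.

```python
import math


def identify_regions(motion_vectors, global_vec, threshold=4.0):
    """Label each macroblock "attention" or "background".

    A macroblock belongs to the attention area when its motion vector differs
    from the global vector by at least `threshold` pixels per frame.
    """
    regions = {}
    for coords, (dx, dy) in motion_vectors.items():
        # Assumption: difference measured as Euclidean distance between the vectors.
        diff = math.hypot(dx - global_vec[0], dy - global_vec[1])
        regions[coords] = "attention" if diff >= threshold else "background"
    return regions
```

With a global vector of (-2, 0), a macroblock moving by (2, 0) differs by exactly 4 pixels and is therefore labeled as attention area, while one moving by (1, 0) stays in the background area.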
[0034]
The gradation setting unit 21 sets a gradation reduction amount for each macroblock of the attention area and the background area identified by the area identification unit 13 of the area division control means 10, and outputs the reduction amount, together with the macroblock coordinates, as reduction information to the gradation reduction unit 22. Here, the reduction amount of the video data (video reduction amount) is determined in advance based on C/N (carrier-to-noise ratio) information of the transmission path over which the video (video data) is transmitted, buffer occupancy information of a network server that stores the video, and the like, and this video reduction amount is input to the gradation setting unit 21 from input means (not shown) such as a keyboard. The gradation setting unit 21 then sets the gradation reduction amount of each area based on the video reduction amount such that the gradation of the background area is lower than the gradation of the attention area.
[0035]
Here, with reference to FIG. 3 and FIG. 4, a method of setting the gradation reduction amount of the attention area and the background area in the gradation setting unit 21 will be described. Here, it is assumed that the video data is a YC (luminance / color difference) video signal, and its gradation is expressed by 8 bits. FIG. 3 shows a configuration of a macroblock in MPEG-2 as an example of a YC (luminance / color difference) video signal. FIG. 4 shows the priorities of the deletion contents for reducing the gradation.
[0036]
As shown in FIG. 3, in MPEG-2, a macroblock is composed of a 16 × 16 pixel Y (luminance) video signal and 8 × 8 pixel C (color difference) video signals (a Cr video signal and a Cb video signal). Here, reducing the gradation means reducing the number of bits representing each pixel, thereby reducing the number of levels that the pixel can represent. For example, by removing 3 bits from the original pixel B1, which can express 256 gradations with 8 bits, the reduced pixel B2 can express only 32 gradations with 5 bits.
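The arithmetic behind this paragraph can be checked with a short sketch. It counts only the raw sample bits of one macroblock (one 16 × 16 Y block plus 8 × 8 Cr and Cb blocks) before any further coding; the function names are illustrative and not taken from the specification.

```python
Y_SAMPLES = 16 * 16      # luminance samples per macroblock
C_SAMPLES = 2 * 8 * 8    # Cr and Cb samples per macroblock


def gradation_levels(bits):
    """Number of levels a pixel can represent with the given number of bits."""
    return 2 ** bits


def macroblock_bits(y_bits, c_bits):
    """Raw bits needed for one macroblock at the given Y and C bit depths."""
    return Y_SAMPLES * y_bits + C_SAMPLES * c_bits


print(gradation_levels(8), gradation_levels(5))  # 256 32
print(macroblock_bits(8, 8))                     # 3072
print(macroblock_bits(8, 5))                     # 2688
```

Reducing only the color difference signals from 8 to 5 bits thus removes 384 of the 3072 raw bits in a macroblock.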
[0037]
Then, as shown in FIG. 4, the gradation setting unit 21 (FIG. 1) sets the gradation reduction amount to each macroblock in the order of priority (1) to (8).
In priority (1), the gradation of the C (color difference) video signal in the background area is reduced, and in priority (2), the gradation of the Y (luminance) video signal in the background area is reduced. In priority (3), the gradation of the C (color difference) video signal in the attention area is reduced, and in priority (4), the gradation of the Y (luminance) video signal in the attention area is reduced. Note that each of these reductions can be performed until the gradation reaches a minimum of 5 bits.
[0038]
Here, the reduction of the C (color difference) video signal is prioritized over the reduction of the Y (luminance) video signal because human vision is less sensitive to the color difference component than to the luminance component. Further, the minimum gradation is set to 5 bits here because it has been reported that, for an original image with 8 bits of gradation, reducing the gradation to 4 bits or less markedly degrades the image quality (reference: Otsuka et al., "Temporal / spatial / gradation resolution and TV image quality," IEICE Technical Committee on Image Engineering, IE 87-114, pp. 17-24, 1987).
[0039]
If further reduction of the gradation is required, in priority (5) the gradation of the C (color difference) video signal in the background area is reduced to less than 5 bits (down to a minimum of 0 bits), and then in priority (6) the gradation of the C (color difference) video signal in the attention area is reduced to less than 5 bits (down to a minimum of 0 bits). In priority (7), the gradation of the Y (luminance) video signal in the background area is reduced to less than 5 bits (down to a minimum of 0 bits). In priority (8), the gradation of the Y (luminance) video signal in the attention area is reduced to less than 5 bits (down to a minimum of 0 bits).
The priority (6) and the priority (7) may be reversed. If there is no region of interest in the video frame, the priorities (3), (4), (6) and (8) are not considered.
[0040]
In priorities (1) to (4), the minimum gradation is 5 bits, but it may be changed depending on the resolution of the video to be processed. For example, the minimum gradation is set to 6 bits for high-definition or NTSC broadcast video, and to 5 bits for SIF (352 horizontal × 240 vertical pixels) or QSIF (176 horizontal × 120 vertical pixels).
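One possible reading of priorities (1) to (8) is sketched below. The specification does not define how the video reduction amount maps to bits, so this sketch assumes a hypothetical budget `bit_steps`: the number of one-bit reductions to hand out across the four combinations of area and signal. It also assumes that each priority is exhausted before the next one starts. The name `set_gradations` and the parameter `floor_bits` (the minimum gradation for priorities (1) to (4)) are illustrative.

```python
def set_gradations(bit_steps, has_attention=True, original_bits=8, floor_bits=5):
    """Return the remaining bit depth per (area, signal) after spending `bit_steps`."""
    order = [
        ("background", "C", floor_bits),  # priority (1)
        ("background", "Y", floor_bits),  # priority (2)
        ("attention", "C", floor_bits),   # priority (3)
        ("attention", "Y", floor_bits),   # priority (4)
        ("background", "C", 0),           # priority (5)
        ("attention", "C", 0),            # priority (6)
        ("background", "Y", 0),           # priority (7)
        ("attention", "Y", 0),            # priority (8)
    ]
    # Without an attention area, priorities (3), (4), (6) and (8) are not considered.
    areas = ("background", "attention") if has_attention else ("background",)
    bits = {(area, signal): original_bits for area in areas for signal in ("C", "Y")}
    remaining = bit_steps
    for area, signal, floor in order:
        if (area, signal) not in bits:
            continue
        # Cut as much as the budget allows, but never below this priority's floor.
        cut = min(remaining, bits[(area, signal)] - floor)
        bits[(area, signal)] -= cut
        remaining -= cut
        if remaining == 0:
            break
    return bits


print(set_gradations(4))
```

With a budget of 4 steps, the background C signal drops to 5 bits and the background Y signal to 7 bits, while the attention area keeps all 8 bits. Passing `floor_bits=6` corresponds to the high-definition or NTSC case mentioned above.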
Returning to FIG. 1, the description will be continued.
[0041]
The gradation reduction unit 22 reduces the gradation of each macroblock based on the gradation reduction amount (reduction information) set for each macroblock by the gradation setting unit 21. The video data whose gradation has been reduced by the gradation reduction unit 22 is output as compressed video data (output video data). For example, if each pixel of the video data is composed of 8 bits and the reduction information notified from the gradation setting unit 21 indicates that the gradation reduction amount of a certain macroblock is 2 bits, the gradation reduction unit 22 sets the gradation of that macroblock to 6 (8 minus 2) bits. In this way, the amount of information in the video data can be compressed.
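A minimal sketch of applying the reduction information to one signal plane of a frame follows. The specification does not say which bits are removed; dropping the least significant bits with a right shift is an assumption of this sketch, as are the name `reduce_frame` and the layout of `reduction_info` (macroblock coordinates mapped to the number of bits to remove).

```python
def reduce_frame(frame, reduction_info, size=16):
    """Return a copy of `frame` with each listed macroblock reduced by its bit count."""
    height, width = len(frame), len(frame[0])
    out = [row[:] for row in frame]
    for (bx, by), cut in reduction_info.items():
        for y in range(by, min(by + size, height)):
            for x in range(bx, min(bx + size, width)):
                # Assumption: gradation is reduced by discarding the lowest `cut` bits.
                out[y][x] = frame[y][x] >> cut
    return out


frame = [[255, 200, 64, 7],
         [128, 3, 16, 255]]
# A tiny 2 x 2 "macroblock" size is used only to keep the example readable.
print(reduce_frame(frame, {(0, 0): 2, (2, 0): 0}, size=2))
```

In the example, the left block goes from 8 to 6 bits per pixel, so its values fall in the range 0 to 63, and the right block is left unchanged.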
[0042]
The configuration of the video data compression device 1 has been described above based on one embodiment, but the present invention is not limited to this. For example, when the attention area and the background area have different color feature amounts, the area division control means 10 may extract the attention area by using a specific gradation value as a threshold instead of using the motion vector. When the attention area is extracted using such a threshold, the subject in the attention area does not need to be moving.
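As a rough illustration of threshold-based extraction, the sketch below labels a macroblock by comparing its mean value in one signal plane with a threshold. Which feature is used, and which side of the threshold counts as the attention area, are not specified in the text; both choices here, along with the name `identify_by_threshold`, are hypothetical.

```python
def identify_by_threshold(plane, threshold, size=16):
    """Label each macroblock by comparing its mean pixel value with `threshold`."""
    regions = {}
    for by in range(0, len(plane) - size + 1, size):
        for bx in range(0, len(plane[0]) - size + 1, size):
            total = sum(plane[y][x]
                        for y in range(by, by + size)
                        for x in range(bx, bx + size))
            mean = total / (size * size)
            # Assumption: blocks at or above the threshold form the attention area.
            regions[(bx, by)] = "attention" if mean >= threshold else "background"
    return regions
```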
[0043]
Further, each means of the video data compression device 1 can be implemented on a computer as a functional program, and these functional programs can be combined and run as a video data compression program.
[0044]
(Operation of Video Data Compression Device 1)
Next, the operation of the video data compression device 1 will be described with reference to FIGS. FIG. 5 is a flowchart showing the operation of the video data compression device 1.
[Region division step]
First, the video data compression device 1 causes the video delay unit 11a to delay the input video data by one video frame (step S1). Then, the motion vector detection unit 11b detects a motion vector by performing block matching for each macroblock between the video frame of the input video data and the video frame delayed by one video frame in the video delay unit 11a, that is, between adjacent video frames (step S2).
[0045]
Based on the motion vectors detected by the motion vector detection unit 11b, the global vector calculation unit 12 calculates the most frequently detected motion vector among the plurality of motion vectors as a global vector (step S3). This global vector can be regarded as a motion vector of the background area.
[0046]
Then, in the video data compression device 1, the area identification unit 13 compares the motion vector of each macroblock detected in step S2 with the global vector calculated in step S3, and identifies a macroblock for which the difference between the two vectors is equal to or greater than a preset value (for example, 4 pixels per video frame) as being included in the attention area. Thus, the video frame is divided into the attention area and the background area (step S4).
[0047]
[Gradation reduction step]
Then, in the video data compression device 1, the gradation setting unit 21 sets the gradation reduction amount of the macroblocks in each area, based on the reduction amount of the video data (video reduction amount) input from input means (not shown) such as a keyboard, such that the gradation of the background area is lower than the gradation of the attention area (step S5). At this time, the gradation reduction amount is set based on a specific priority order (see FIG. 4).
[0048]
Then, in the video data compression device 1, for the macroblock for which the gradation reduction amount is set in step S5, the gradation reduction unit 22 reduces the gradation by the set gradation reduction amount (step S6). Then, a video frame whose gradation is reduced for each macro block is output in time series as compressed video data (output video data) (step S7). Then, it is determined whether or not the input of the video data (input video data) has been completed (step S8). If the input has been completed (Yes), the operation is terminated. On the other hand, if video data is still input (No in step S8), the process returns to step S1 to continue the operation.
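The flow of steps S1 to S8 can be summarized as a loop, sketched below under stated assumptions. The five processing stages are passed in as callables with hypothetical signatures, so the sketch shows only the control flow. The treatment of the very first frame, which has no delayed frame to compare against, is not described in the text; here it is compared with itself.

```python
def compress_stream(frames, detect_vectors, calc_global, identify,
                    set_reduction, reduce_frame):
    """Yield gradation-reduced frames for an iterable of input frames."""
    delayed = None
    for frame in frames:
        # S1: one-frame delay. Assumption: the first frame is compared with itself.
        reference = frame if delayed is None else delayed
        vectors = detect_vectors(frame, reference)   # S2: block matching
        global_vec = calc_global(vectors)            # S3: most frequent vector
        regions = identify(vectors, global_vec)      # S4: attention / background
        reduction = set_reduction(regions)           # S5: per-macroblock reduction
        yield reduce_frame(frame, reduction)         # S6 and S7: reduce and output
        delayed = frame
    # S8: the loop ends when the input video data ends.
```

Writing the loop as a generator keeps the output in time series, one compressed frame per input frame.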
[0049]
Through the above steps, the video data compression device 1 can identify the attention area and the background area in the input video data, and can independently reduce the gradation of each area. Then, the video data reduced and compressed by the video data compression device 1 becomes video data in which the deterioration of the image quality of the attention area is reduced more than that of the background area.
[0050]
(Configuration of Video Data Compression Device: Second Embodiment)
Next, a video data compression device 1B according to a second embodiment of the present invention will be described with reference to FIG. 2. FIG. 2 is a block diagram showing the configuration of the video data compression device 1B. The video data compression device 1B shown in FIG. 2 divides the input video data into a moving attention area and a background area, which is the remaining area, and compresses and outputs the video data by reducing, for each of the divided areas, the gradation required to express pixel values.
[0051]
The video data compression device 1B is configured by providing a camera data vector calculation unit 14 in place of the global vector calculation unit 12 of the video data compression device 1 (FIG. 1) and by further adding an area expansion unit 15. Except for the camera data vector calculation unit 14, the area expansion unit 15, and the function of the area identification unit 13B, the configuration is the same as that of the video data compression device 1 shown in FIG. 1, and its description is omitted.
[0052]
The camera data vector calculation unit 14 calculates, based on camera data such as the pan, tilt, and zoom of a video camera (not shown) at the time the video data (input video data) was captured, the motion vector of each macroblock (background motion vector) on the assumption that no moving attention area exists in the video frame of the input video data. The camera data is input in chronological order in synchronization with the input video data. The background motion vector calculated by the camera data vector calculation unit 14 is output to the area identification unit 13B.
[0053]
The calculation of the motion vector using the camera data in the camera data vector calculation unit 14 can be performed, for example, by the method described in “A High-Prication Camera Operational Parameter Measurement System and It's Application to Electronic Transactions, Information and Information Technology 47, No. 1, pp. 46-55, March 2001”.
[0054]
That is, the camera data vector calculation unit 14 calculates where a given macroblock in a video frame moves within the video frame as a result of the camera motion (pan, tilt, zoom, and so on), and takes the moving direction and moving amount of the macroblock as the background motion vector. For example, when the camera is panned to the right of the screen, the area captured as the background appears to move to the left. The moved area has motion within the video frame, but is actually a background area. In this way, the camera data vector calculation unit 14 calculates the background motion vector caused by the camera motion.
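To make the pan example concrete, the sketch below derives a background motion vector from changes in pan, tilt, and zoom using a deliberately simplified camera model. This is not the method of the cited reference. The model, the parameter names (`focal_px`, `d_pan`, `d_tilt`, `zoom_ratio`), and the sign conventions (positive pan is to the right, positive tilt is upward, image y grows downward) are all assumptions, and the approximation is only reasonable near the image center.

```python
import math


def background_vector(x, y, center, focal_px, d_pan, d_tilt, zoom_ratio=1.0):
    """Approximate displacement (dx, dy) of a background point at pixel (x, y).

    focal_px   : focal length expressed in pixels (hypothetical camera datum)
    d_pan      : pan change between frames in radians, positive to the right
    d_tilt     : tilt change between frames in radians, positive upward
    zoom_ratio : magnification change between frames (1.0 means no zoom)
    """
    cx, cy = center
    # Zoom scales the picture about the image center.
    zx = cx + (x - cx) * zoom_ratio
    zy = cy + (y - cy) * zoom_ratio
    # Panning right shifts the background left; tilting up shifts it down.
    px = zx - focal_px * math.tan(d_pan)
    py = zy + focal_px * math.tan(d_tilt)
    return (px - x, py - y)


dx, dy = background_vector(100, 50, (160, 120), 1000.0, math.atan(0.01), 0.0)
print(round(dx, 3), round(dy, 3))  # -10.0 0.0
```

The negative horizontal component in the example matches the statement that a pan to the right makes the background appear to move to the left.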
[0055]
The area identification unit 13B identifies, based on the motion vector of each macroblock detected by the motion vector detection unit 11b and the background motion vector calculated by the camera data vector calculation unit 14, whether each macroblock is included in the background area or in the attention area, which is the remaining area. The area (attention area or background area) identified for each macroblock by the area identification unit 13B is output, together with the macroblock coordinates, as area information to the area expansion unit 15.
[0056]
The area identification unit 13B compares the motion vector of each macroblock detected by the motion vector detection unit 11b with the background motion vector calculated by the camera data vector calculation unit 14, and determines that a macroblock whose motion differs from the background motion vector is a macroblock included in the attention area.
[0057]
The area expansion unit (area expansion means) 15 examines, based on the area information output from the area identification unit 13B, the correlation between macroblocks of the two areas where the attention area and the background area adjoin. If the correlation is higher than a predetermined correlation value, the macroblock of the background area is regarded as a macroblock of the attention area, and the attention area is thereby expanded. The correlation is examined, for example, by comparing feature amounts of the macroblocks such as luminance and color. The area information in which the attention area has been expanded by the area expansion unit 15 is output to the gradation setting unit 21 of the gradation reduction control means 20.
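A single pass of this expansion might be sketched as follows. The specification does not define the correlation measure; this sketch stands in for it with the absolute difference of one per-macroblock feature (for example mean luminance), treating a small difference as a high correlation. The names `expand_attention`, `features`, and `max_diff` are hypothetical, and only the four directly adjacent macroblocks are considered neighbors.

```python
def expand_attention(regions, features, size=16, max_diff=10.0):
    """Relabel background macroblocks that resemble an adjacent attention macroblock.

    regions  : {(bx, by): "attention" | "background"}
    features : {(bx, by): feature value}, e.g. mean luminance of the macroblock
    """
    expanded = dict(regions)
    for (bx, by), label in regions.items():
        if label != "background":
            continue
        neighbours = ((bx - size, by), (bx + size, by),
                      (bx, by - size), (bx, by + size))
        for pos in neighbours:
            # Only macroblocks that were attention blocks before expansion count,
            # so one pass grows the attention area by at most one macroblock.
            if regions.get(pos) != "attention":
                continue
            if abs(features[(bx, by)] - features[pos]) <= max_diff:
                expanded[(bx, by)] = "attention"
                break
    return expanded
```

Comparing against the original labels rather than the updated ones keeps a single pass from spreading the attention area across a whole run of similar background blocks.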
[0058]
The configuration of the video data compression device 1B has been described above. Each means of the video data compression device 1B can also be implemented on a computer as a functional program, and these functional programs can be combined and run as a video data compression program.
[0059]
(Operation of Video Data Compression Device 1B)
Next, the operation of the video data compression device 1B will be described with reference to FIGS. FIG. 6 is a flowchart showing the operation of the video data compression device 1B.
First, the video data compression device 1B causes the video delay unit 11a to delay the input video data by one video frame (step S10). Then, the motion vector detection unit 11b detects a motion vector by performing block matching for each macroblock between the video frame of the input video data and the video frame delayed by one video frame in the video delay unit 11a, that is, between adjacent video frames (step S11).
[0060]
Then, in the video data compression device 1B, the camera data vector calculation unit 14 calculates, based on camera data such as the pan, tilt, and zoom of the camera (not shown) at the time the video data (input video data) was captured, the motion vector with which the background area moves within the video frame (background motion vector) (step S12).
[0061]
Next, in the video data compression device 1B, the area identification unit 13B compares the motion vector of each macroblock detected in step S11 with the background motion vector calculated in step S12, and identifies a macroblock whose motion vector differs from the background motion vector as a macroblock included in the attention area. Thus, the video frame is divided into the attention area and the background area (step S13).
[0062]
Furthermore, the video data compression device 1B checks the correlation between the macroblocks of the two regions in the region where the attention region and the background region are adjacent to each other by the region expansion unit 15, and when the correlation is higher than a predetermined correlation value, The macro block in the background area is regarded as the macro block in the area of interest, and the area of interest is extended (step S14).
Note that the subsequent operation is the same as the gradation reduction step (step S5 and subsequent steps) in FIG. 5, and a description thereof is omitted.
[0063]
Through the above steps, even for video data obtained by shooting a subject while operating the camera, for example with a moving camera, the video data compression device 1B can identify the attention area and the background area in the input video data and can reduce the gradation of each area independently. The video data whose gradation has been reduced and which has been compressed by the video data compression device 1B is video data in which the image quality deterioration of the attention area is smaller than that of the background area.
[0064]
[Effects of the Invention]
As described above, the video data compression apparatus, method, and program according to the present invention have the following excellent effects.
[0065]
According to the invention described in claim 1, claim 6, or claim 7, an attention area is extracted from the input video data and distinguished from the background area, which is the remaining area, and the gradation of each area can be reduced individually. Thus, by making the gradation of the background area lower than the gradation of the attention area, it is possible to reduce the image quality deterioration of the attention area and to increase the compression ratio of the video data.
[0066]
For example, on a small screen such as that of a mobile terminal, obtaining the information contained in the video data matters more than viewing the video as a whole. Reducing the image quality deterioration of the attention area containing that information is therefore effective in services that distribute video data.
[0067]
According to the invention described in claim 2, a moving area can be recognized as the attention area in block units. This makes it easy to reduce the gradation of the attention area and of the background area, which is the remaining area, individually in block units.
[0068]
According to the third aspect of the present invention, even a region recognized as a background region by a motion vector can be determined as a region of interest by a feature amount such as a color. As a result, for example, it is possible to recognize, as a region of interest, a region where movement is small, such as at the edge of clothes, even though a person is moving.
[0069]
According to the invention described in claim 4, the gradation of the background area is made lower than the gradation of the attention area, so that the attention area accounts for a larger share of the compressed data than in conventional compression. This makes it possible to reduce the image quality deterioration of the attention area and to increase the compression ratio of the video data.
[0070]
According to the invention described in claim 5, because human vision is less sensitive to the color difference component than to the luminance component, the gradation assigned to the color difference component is made lower than the gradation assigned to the luminance component in each of the attention area and the background area, so that the compression ratio can be increased while suppressing image quality deterioration.
[Brief description of the drawings]
FIG. 1 is a block diagram showing an overall configuration of a video data compression device according to a first embodiment of the present invention.
FIG. 2 is a block diagram showing an overall configuration of a video data compression device according to a second embodiment of the present invention.
FIG. 3 is an explanatory diagram for describing a configuration example of a macroblock.
FIG. 4 is an explanatory diagram for explaining the contents of gradation reduction in the gradation setting unit and their priorities.
FIG. 5 is a flowchart showing an operation of the video data compression device according to the first embodiment of the present invention.
FIG. 6 is a flowchart showing an operation of the video data compression device according to the second embodiment of the present invention.
FIG. 7 is an explanatory diagram for describing an example of an attention area and a background area.
[Explanation of symbols]
1, 1B ... video data compression device
10, 10B ... area division control means
11 ... motion vector calculation unit (motion vector calculation means)
11a ... video delay unit
11b ... motion vector detection unit
12 ... global vector calculation unit
13, 13B ... area identification unit (area identification means)
14 ... camera data vector calculation unit
15 ... area expansion unit (area expansion means)
20 ... gradation reduction control means
21 ... gradation setting unit
22 ... gradation reduction unit

Claims

A video data compression device that compresses input video data by reducing the gradation indicating the number of colors that each pixel can represent, the device comprising:
area division control means for dividing the video data, for each video frame, into an attention area in which a subject is present and a background area, which is the remaining area; and
gradation reduction control means for individually reducing the gradation of each of the attention area and the background area divided by the area division control means.

The video data compression device according to claim 1, wherein the area division control means comprises:
motion vector calculation means for calculating a motion vector for each block of a specific size between consecutive video frames of the video data; and
area identification means for identifying, based on the motion vector calculated by the motion vector calculation means, whether each block in the video frame is a block included in the attention area or a block included in the background area, which is the remaining area.

The video data compression device according to claim 2, wherein the area division control means comprises:
area expansion means for expanding the attention area, where the attention area and the background area identified by the area identification means adjoin, by treating a block of the background area as a block of the attention area based on the correlation between the block of the attention area and the block of the background area.

The video data compression device according to claim 1, wherein the gradation reduction control means reduces the gradation assigned to the background area more than the gradation assigned to the attention area.

The video data compression device according to any one of claims 1 to 4, wherein the gradation reduction control means reduces, for each of the attention area and the background area, the gradation assigned to the color difference component more than the gradation assigned to the luminance component.

A video data compression method for compressing the video data by reducing the gradation indicating the number of colors that can be represented by each pixel for the input video data,
An area dividing step of dividing the video data into an attention area where a subject is present for each video frame and a background area other than the attention area;
A gradation reduction step of individually reducing the gradation for each of the attention area and the background area divided in the area division step,
The video data compression method, wherein the gradation reduction step reduces a gradation assigned to the background area more than a gradation assigned to the attention area.

A video data compression program that, in order to compress input video data by reducing the gradation indicating the number of colors that each pixel can represent, causes a computer to function as:
area division control means for dividing the video data, for each video frame, into an attention area in which a subject is present and a background area, which is the remaining area; and
gradation reduction control means for individually reducing the gradation of each of the attention area and the background area divided by the area division control means,
wherein the gradation reduction control means reduces the gradation assigned to the background area more than the gradation assigned to the attention area.