JP2004363974A

JP2004363974A - Video encoding transmission device and video encoding amount control method

Info

Publication number: JP2004363974A
Application number: JP2003160261A
Authority: JP
Inventors: Masaki Sato; 正樹佐藤; Kazuya Takagi; 一也高木
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 2003-06-05
Filing date: 2003-06-05
Publication date: 2004-12-24

Abstract

PROBLEM TO BE SOLVED: To provide a video encoding transmission device, which encodes and transmits video obtained by photographing a wide-angle region at high resolution, that transmits video efficiently while improving the at-a-glance visibility of important regions in the wide-angle video, without encoding and transmitting the entire wide-angle video.

SOLUTION: The video encoding transmission device comprises: a video segmentation section 103 that segments a first region and a plurality of second regions from the wide-angle video; a main region processing section 104 that extracts a non-important region from the first region; a main region encoding section 105 that encodes the first region; an auxiliary region processing section 107 that extracts important regions from the second regions; an auxiliary region encoding section 108 that encodes the important regions; and an encoding amount control section 106 that decides which important region is to be superimposed on the non-important region and assigns encoding amounts to the first region and the important regions. The video obtained by encoding an important region of an auxiliary region is superimposed on the non-important region of the encoded main region for transmission.

COPYRIGHT: (C)2005, JPO&NCIPI

Description

【０００１】
【発明の属する技術分野】
本発明は、広角領域を高解像度で撮影した映像を、符号化および伝送する映像符号化伝送装置に関するものである。
【０００２】
【従来の技術】
従来、広角カメラを利用した映像監視システムや遠隔モニタリングシステム等に用いられる映像処理装置では、ユーザがカメラから入力された広角映像領域から任意の映像領域を選択すると、映像処理装置が選択領域を受信端末に伝送し、受信端末は表示能力に合わせて画素密度変換処理などを施したのちに選択領域を表示する方法を使用している（特許文献１参照）。
【０００３】
図１５は、特許文献１の実施の形態を説明する図である。図１５では、ユーザが、受信端末のユーザインタフェースにより、カメラから入力されたカメラ撮影領域１５０１から任意の選択領域１５０３を選択する。映像処理装置は、ユーザからの指示に合わせて、選択領域１５０３をカメラ撮影領域から切り出し受信端末に伝送する。受信端末は端末の表示領域に合わせて画素密度変換を行う。これにより、ユーザは、所望する映像領域を受信端末上に表示画面１５０２として表示できる。
【０００４】
この構成により、ユーザがカメラ撮影領域中の任意の領域（選択領域１５０３）を変更することにより、受信端末により表示される方位およびズーム倍率を変更できることになる。
【０００５】
また、特許文献２では、２画面表示機能を備えたテレビ電話装置に関し、親画面中の特定領域（人物の顔や文字領域など）を検出し、その結果に応じて子画面又は親画面の位置／サイズ／表示優先度を制御する方法について開示されている。
【０００６】
図１６は、特許文献２の実施の形態を説明する図である。図１６には、特許文献２を適用前のテレビ電話の表示画面１６０１と、適用後のテレビ電話の表示画面１６０４を示している。適用前の表示画面１６０１では、親画面領域１６０２中の特定領域（この例では通話相手の顔領域）と、子画面領域１６０３（この例では自画像）の一部が重なり合っている。一方、適用後の表示画面１６０４では、親画面領域１６０２の移動やサイズ変更を行った親画面領域１６０５と、子画面領域１６０３は重なり合っていない。このように、親画面あるいは子画面の位置／サイズ／表示優先度を制御することで、特定の領域が重なり合うことを防ぐことが可能となる。
【０００７】
また、特許文献３では、２画面表示機能を備えたテレビ電話装置に関し、子画面の表示位置／サイズを考慮した上で、親画面を符号化する方法について開示されている。
【０００８】
この例では、子画面により隠されることになる親画面領域の符号量を完全に０にすることなく一定量の符号量を割り当てることで、子画面の位置や大きさの変更、あるいは子画面の消去などを行った際に、画像の無い部分を生じてしまうという問題に対する対処を行っている。
【０００９】
具体的な動作としては、自テレビ電話装置は、相手側のテレビ電話装置より、子画面（相手の自画像）の位置やサイズに関する情報を取得し、自テレビ電話装置より送信する映像（自画像）から、相手側端末の子画面が重なり合う領域を決定する。自テレビ電話装置は、決定した領域に対する符合量を、他の同じ面積の領域よりも小さく割り当て符号化し自画像の送信を行う。これにより、子画面が重なる領域の符号量を少なくすることで一画面あたりの符号量を抑えつつ、子画面の表示位置や大きさの変更などによる画質の劣化を少なくできる。
【００１０】
【特許文献１】
特開平８−２３７５９０号公報（第６図）
【特許文献２】
特開平１０−２００８７３号公報（第６図〜第９図）
【特許文献３】
特開２００１−４５４７６号公報（第１図）
【００１１】
【発明が解決しようとする課題】
しかしながら、近年、ＣＣＤ等の受光素子の高解像度化および撮影領域の広角化の進展により、広角カメラにより撮影された映像データは、通常の画角のカメラに比べて大きく、１画面あたり数メガバイト以上と非常に大きなものとなってきており、限られた伝送帯域を使用して撮影された映像を伝送するためには、従来の手法では不十分である。
【００１２】
特許文献１で開示されている方法を用い、選択領域１５０３を広角カメラにより撮影されたカメラ撮影領域１５０１と同じにし、限られた伝送帯域に見合うように非常に高い圧縮率で選択領域１５０３の映像を圧縮して伝送することは可能である。しかしながら、この手法では、広角カメラの映像を高い圧縮率で圧縮する必要があり、映像受信端末により受信、表示された映像の品質は低いものとなってしまう。また、映像受信端末で受信、表示される映像の品質を高めるために、選択領域１５０３をカメラ撮影領域１５０１全体ではなく、ユーザがその一部を一定のサイズ以下（一定以上の画質を確保するため）で選択しながら、低い圧縮率で圧縮しながら伝送することも可能である。しかしながら、この場合には、選択領域１５０３以外のカメラ撮影領域１５０１の映像を、ユーザは映像受信端末で見ることができず、広角カメラで撮影した映像データが無駄になる。
【００１３】
また、特許文献２では、テレビ電話端末において、通話相手から送信されてくる映像の顔や文字などの特定領域と、自画像が重ならないように、相手映像あるいは自画像の位置／サイズなどを変更するための方法を開示しているが、電話網という固定帯域の伝送網を対象とした方法であり、相手から送信されてくる映像は固定的な伝送帯域を占有できることを前提としている。したがって、伝送帯域が変動するような伝送網へそのまま適用することはできない。また、開示されている方法は相手画像と自画像という関係を対象とした方法であって、親画像と複数の子画像を同時に伝送するという場合に、そのまま適用することは困難である。
【００１４】
また、特許文献３では、テレビ電話端末において、通話相手側の子画面表示位置を取得し、自画像中で相手側の子画面が重ね合わされると予想される位置の符号量を低く抑えることで一画面の符号量を抑える方法を開示しているが、特許文献２と同様に電話網という固定帯域の伝送網を対象とした方法であり、送信する映像は固定的な伝送帯域を占有できることを前提としている。したがって、伝送帯域が変動するような伝送網へそのまま適用することはできない。また、開示されている方法は、自画像と相手側の子画面の重ね合わせを考慮した方法であって、自端末で親画面と複数の子画面を同時に伝送するという場合に、そのまま適用することは困難である。
【００１５】
本発明は、上記のような課題を解消するためになされたものであって、広角カメラ映像全体を符号化伝送することなく、広角映像中の重要領域の一覧性を高めながら、効率的な情報伝送を可能とする映像符号化伝送装置を提供することを目的とする。
【００１６】
【課題を解決するための手段】
この課題を解決するために本発明は、広角映像から第一の領域と複数の第二の領域の切り出しを行う映像切り出し部、第一の領域から非重要領域の抽出を行う主領域処理部、第一の領域の符号化処理を行う主領域符号化部、第二の領域から重要領域の抽出を行う補助領域処理部、重要領域の符号化処理を行う補助領域符号化部、非重要領域に重ね合わせる重要領域を決定し、第一の領域および重要領域に符号量の割り当てを行う符号量制御部を備えたものである。
【００１７】
これにより、高解像度な広角カメラによって撮影された映像を、伝送帯域の変動あるいは重要度に合わせて符号化伝送の基本部分となる主領域と主領域以外の領域である補助領域とに分割し、主領域中から非重要領域を選択して、その領域に割り当てる符号量を低減するとともに、補助領域中の重要領域を符号化した映像を、符号化した主領域中の非重要領域上に重ねあわせることによって、補助領域中の重要領域を符号化した映像を、符号化した主領域中の非重要領域上に重ね合わせることが可能となり、広角カメラ映像全体を符号化伝送することなく、広角映像中の重要領域の一覧性を高めながら、効率的に映像伝送できる、という効果が得られる。
【００１８】
さらに本願発明は、映像切り出し部が、第一の特定領域を多く含むように第一の領域を決定することを特徴とする。これにより、第一の領域を主要部として効果的に選択することができる。
【００１９】
さらに、第一の特定領域が、所定のサイズ以下でかつ動きの大きな映像領域を多く含む領域であることにより、活発に移動する被写体をより多く含む領域を主要部として効果的に選択でき、また、第一の特定領域が、所定のサイズ以下でかつ人物の顔を多く含む映像領域を多く含む領域により、例えば監視映像などで人物を特定したい映像を伝送したいときなど、より効果的に主要部を選択することができる。
【００２０】
さらに、第一の領域より非重要領域を、動きが小さかったり、人物の顔などが含まれない領域から、第二の領域より重要領域を、動きが大きかったり、人物の顔などが含まれている領域から選ぶことにより、表示画面や伝送帯域に制限がある受信側に対して、広角映像の全体を高画質で伝送することなく、重要な情報を欠損させることなく、符号量を調節して伝送することができる。
【００２１】
さらに、本願発明では、映像受信端末から送信される映像データを受信する伝送情報受信部をさらに具備することにより、ユーザの嗜好、および現状の伝送状況等に照らし合わせて、より動的に目標符号量を設定することができる。
【００２２】
【発明の実施の形態】
以下、本発明の実施の形態について、図１から図１４を用いて説明する。
【００２３】
（実施の形態１）
図１は本発明の第一の実施形態における映像符号化伝送装置の構成の一例を示す図である。図１において、１０１はＣＣＤ等の受光素子を備え広角を撮影できる広角カメラより広角映像を取得する映像入力部、１０２は映像入力部１０１により入力された映像を一時的に記憶するための映像記憶部、１１１は本発明の映像符号化伝送装置の動作を決めるためのパラメータを記憶する動作パラメータ記憶部、１０３は動作パラメータ記憶部１１１に記憶されているパラメータに基づき、映像記憶部１０２に記憶されている広角映像から符号化伝送の基本部分である主領域および主領域以外の領域である補助領域の切り出しを行う映像切り出し部、１０４は映像切り出し部１０３により切り出された主領域から非重要領域の抽出を行う主領域処理部、１０５は主領域処理部１０４より出力される主領域映像の符号化処理を行う主領域符号化部、１０７は映像切り出し部１０３により切り出された補助領域から重要領域の抽出を行う補助領域処理部、１０８は補助領域処理部１０７より出力される重要領域映像の符号化処理を行う補助領域符号化部、１０６は動作パラメータ記憶部１１１に記憶された動作パラメータ、主領域処理部１０４より取得する非重要領域に関する情報、補助領域処理部１０７より取得する重要領域に関する情報、主領域符号化部１０５より取得する符号量、および補助領域符号化部１０８より取得する符号量に基づき非重要領域や重要領域を決定して各領域に符号量の割り当てを行う符号量制御部１０６、１０９は主領域符号化部１０５および補助領域符号化部１０８から出力される符号化映像データを多重化する映像多重化部、１１０は映像多重化部１０９により主領域および補助領域の映像が多重化された映像データをネットワーク環境に合わせてパケット化して伝送を行う映像送信部である。
【００２４】
図２は、本発明の第一の実施形態における映像符号化伝送装置の利用形態の一例を示す図である。２０１−１、２０１−２は本発明の映像符号化伝送装置、２０２は映像符号化伝送装置２０１−１、２０１−２より送信される符号化映像を受信し復号して画面に表示するための映像受信端末、２０３は映像符号化装置２０１−１、２０１−２より送信される符号化映像を蓄積する映像蓄積装置、２０４は映像符号化装置２０１−１、２０１−２、映像受信端末２０２、映像蓄積装置２０３を接続するネットワークである。ここでネットワーク２０４は、ＬＡＮやＷＡＮなどのＩＰネットワークのみならず、ＡＴＭ等の他の伝送方式のネットワークを使用しても良い。また、映像受信端末２０２は、本発明の映像符号化伝送装置により使用される符号化方式（例えばＭＰＥＧ４）により符号化された映像データの復号部を持つ必要があるが、一般的に使用されているＭＰＥＧ４デコーダを用いればよい。
【００２５】
以上のように構成された映像符号化伝送装置について、以下その動作を説明する。
【００２６】
最初に本発明の映像符号化伝送装置の動作概要を、図３および図４を用いて説明する。図３は本発明の第一の実施形態における映像符号化伝送装置が処理対象とする広角映像領域の一例を示す図である。本発明の映像符号化伝送装置では、図３に示すような広角映像領域（座標（０，０）、（ｘ、０）、（０、ｙ）、（ｘ、ｙ）で定義される）を入力として、その広角映像領域を、符号化伝送する基本部分である主領域３０１（座標（ａ、０）、（ｂ、０）、（ａ、ｙ）、（ｂ、ｙ）で定義される）とそれ以外の領域である補助領域３０２−１（座標（０，０）、（ａ、０）、（０、ｙ）、（ａ、ｙ）で定義される）および補助領域３０２−２（（ｂ、０）、（ｘ、０）、（ｂ、ｙ）、（ｘ、ｙ）で定義される）とに分割する。この図では、主領域一つ、補助領域二つに分割しているが、この分割数は自由に変更可能である。二種類の領域に分割したのち、主領域３０１内で非重要な領域を検出し、低画質化領域とする。図３では、低画質化領域３０４−１（座標（ｅｘ１、ｅｙ１）、（ｆｘ１、ｆｙ１）で定義される）および低画質化領域３０４−２（座標（ｅｘ２、ｅｙ２）、（ｆｘ２、ｆｙ２）で定義される）として図示されている。補助領域３０２−１、３０２−２内では、重要な領域を検出し、子画面領域とする。図３では、子画面領域３０３−１（座標（ｃｘ１、ｃｙ１）、（ｄｘ１、ｄｙ１）で定義される）および子画面領域３０３−２（座標（ｃｘ２、ｃｙ２）、（ｄｘ２、ｄｙ２）で定義される）として図示されている。主領域内での低画質化領域および補助領域内での子画面領域を検出後、主領域上の低画質化領域を低画質に符号化した上に、補助領域中の子画面領域を符号化した映像を重ね合わせて伝送を行う。具体例を図４に示す。図４は本発明の第一の実施形態における映像符号化伝送装置の送信映像の一例を示す図であり、図３に示す広角映像領域から生成、符号化された映像である。主領域４０１（座標（ａ、０）、（ｂ、０）、（ａ、ｙ）、（ｂ、ｙ）で定義される）は主領域３０１と同一であり、子画面領域４０２−１（座標（ｅｘ１、ｅｙ１）、（ｆｘ１、ｆｙ１）で定義される）、４０２−２（座標（ｅｘ２、ｅｙ２）、（ｆｘ２、ｆｙ２）で定義される）は、それぞれ子画面領域３０３−１、３０３−２に対応する。
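[0027]
Next, the detailed operation of the video encoding transmission device of the present invention will be described with reference to FIGS. 5 to 11.
[0028]
First, the video input unit 101 acquires one frame of wide-angle video from a wide-angle camera equipped with a light-receiving element such as a CCD, and stores it in the video storage unit 102, which resides in memory such as DRAM or flash memory. Each functional block processes the video one frame at a time. The operation parameter storage unit 111 stores initial parameters that are given to the device before encoding and transmission begin, either directly by the user or by the video receiving terminal 202 or the video storage device 203. These include the "transmission video target code amount" (the code amount the device aims for when encoding), the "initial network transmission bandwidth", the "receivable video size" (the display capability of the video receiving terminal 202), and the "frame rate" (the number of frames acquired from the wide-angle camera per second). These initial parameters can be exchanged with the video receiving terminal 202 or the video storage device 203 by widely used methods such as SIP with SDP or RTSP with SDP.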
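[0029]
Next, once the wide-angle video has been stored in the video storage unit 102, the video cutout unit 103 operates according to the flowchart shown in FIG. 5. FIG. 5 is a flowchart showing the operation of the video cutout unit of the video encoding transmission device according to the first embodiment of the present invention.
[0030]
The operation of the video cutout unit 103 consists of steps 501 to 504.
[0031]
In step 501, the motion detection step, blocks that have moved relative to a past frame are detected in the wide-angle video stored in the video storage unit 102. Motion is detected with the motion detection methods used in widely deployed coding schemes such as MPEG-2 and MPEG-4.
[0032]
FIG. 6 is a diagram explaining block-based motion detection in the video encoding transmission device according to the first embodiment of the present invention. Motion detection step 501 is briefly described with reference to FIG. 6. The video area 601 captured by the camera is divided into motion detection blocks 602, each made up of a fixed number of pixels, and processing is performed block by block. In MPEG-2 and MPEG-4 this motion detection block 602 is called a macroblock and consists of, for example, 16 × 16 pixels. Each motion detection block 602 is compared with the motion detection blocks in a past video area to search for a similar block. When a similar block is found, the motion vector 603 needed to move it to the position of the current motion detection block 602 is obtained. The motion vector 603 is expressed, for example, as 100 pixels in the x direction and 200 pixels in the y direction; it may equally be expressed in units of motion detection blocks, or as 150 pixels in the 40-degree direction (an angle measured clockwise from the x axis of the video area, which is taken as 0 degrees). In the following, motion vectors are expressed as a combination of an angle and a number of pixels. When no similar block can be found, the motion vector 603 of that motion detection block 602 is set to an angle of 0 degrees and a pixel count of ∞.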
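As a minimal sketch of the angle-and-pixel-count representation described above, the following Python converts a per-block displacement into that form. The names `to_polar` and `NO_MATCH` are illustrative, and the sketch assumes image coordinates in which y grows downward, so that a positive angle reads as clockwise on screen; the specification does not state the axis orientation.

```python
import math

# Hypothetical marker for a block with no similar block in the past frame:
# angle 0 degrees, pixel count infinity, as described in the text.
NO_MATCH = (0.0, math.inf)

def to_polar(dx: float, dy: float) -> tuple[float, float]:
    """Convert a displacement (dx, dy) in pixels to (angle_deg, pixels).

    Assumption: y grows downward, so atan2(dy, dx) increases clockwise
    from the x axis when viewed on screen. The angle is normalized to
    the range [0, 360).
    """
    angle = math.degrees(math.atan2(dy, dx)) % 360.0
    return angle, math.hypot(dx, dy)
```

For instance, `to_polar(100, 200)` yields the polar form of the "100 pixels in x, 200 pixels in y" vector mentioned in the text.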
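[0033]
In step 502, the selected-region determination step, selected regions are chosen using the motion vectors 603 obtained in step 501. The selected regions 604-1 and 604-2 are formed by joining motion detection blocks 602, as follows. First, the motion vector 603 of each motion detection block 602 is examined to find vectors whose difference (in angle and pixel count) is within a threshold (for example, 10%). Among the motion detection blocks whose motion vector difference is within the threshold, adjoining blocks are gathered into the selected regions 604-1 and 604-2. If a selected region is not rectangular, neighboring motion detection blocks are joined to it so that it becomes rectangular. Selected regions are managed with the selected-region management table shown in FIG. 10. FIG. 10 is a diagram showing an example of the selected-region management table of the video encoding transmission device according to the first embodiment of the present invention. The table consists of a "region ID" identifying the selected region, the "upper-left coordinates" of the motion detection block 602 at the upper left of the region, the "lower-right coordinates" of the motion detection block 602 at the lower right of the region, the "average motion amount" (in pixels) of the motion vectors 603 of the blocks in the region, and the "average motion direction" (in degrees) of those motion vectors. For example, the selected region with region ID 1 has upper-left coordinates (xa1, ya1), lower-right coordinates (xb1, yb1), an average motion amount of 10 pixels, and an average motion direction of 30 degrees.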
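The grouping in step 502 can be sketched as a flood fill over the grid of motion detection blocks. This is only an illustration under stated assumptions, not the procedure of the specification: the 10% threshold is interpreted here as 10% of 360 degrees for the angle and 10% of the larger pixel count for the magnitude, blocks with no match (pixel count ∞) never join a region, the bounding box of the joined blocks stands in for the "make it rectangular" step, and the average motion direction column is omitted. The names `similar` and `select_regions` are hypothetical.

```python
import math
from collections import deque

def similar(v, w, tol=0.10):
    """Compare two (angle_deg, pixels) vectors under the assumed threshold."""
    if math.isinf(v[1]) or math.isinf(w[1]):
        return False  # assumption: unmatched blocks never join a region
    da = abs(v[0] - w[0]) % 360.0
    da = min(da, 360.0 - da)  # shortest angular distance
    return da <= tol * 360.0 and abs(v[1] - w[1]) <= tol * max(v[1], w[1])

def select_regions(grid, tol=0.10):
    """grid[row][col] holds the (angle_deg, pixels) vector of one block.

    Returns table rows with coordinates in block units (col, row).
    """
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    table = []
    for r0 in range(rows):
        for c0 in range(cols):
            if seen[r0][c0]:
                continue
            seen[r0][c0] = True
            queue, members = deque([(r0, c0)]), []
            while queue:  # breadth-first flood fill over adjoining blocks
                r, c = queue.popleft()
                members.append((r, c))
                for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                    inside = 0 <= nr < rows and 0 <= nc < cols
                    if inside and not seen[nr][nc] and similar(grid[r][c], grid[nr][nc], tol):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            table.append({
                "region_id": len(table) + 1,
                "top_left": (min(c for _, c in members), min(r for r, _ in members)),
                "bottom_right": (max(c for _, c in members), max(r for r, _ in members)),
                "avg_motion": sum(grid[r][c][1] for r, c in members) / len(members),
            })
    return table
```

Because neighbors are compared pairwise, a region can chain together blocks whose vectors drift gradually; a stricter variant would compare each candidate against the region's running average.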
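[0034]
In step 503, the main and auxiliary region determination step, the selected-region management table is searched and the main region is determined so that it contains as many selected regions with a large average motion amount as possible without exceeding the "receivable video size" stored in the operation parameter storage unit 111. The following description assumes that the main region has been determined as shown in FIG. 3. The parts of the wide-angle video area other than the main region 301 become the auxiliary regions 302-1 and 302-2.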
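One possible reading of step 503 is sketched below for the layout of FIG. 3, where the main region spans the full frame height and only its horizontal extent (a, b) has to be chosen. The scoring rule (sum of the average motion amounts of the selected regions that fit entirely inside the window) and the name `choose_main_region` are assumptions made for illustration; the specification only says that regions with large average motion should be included as far as possible within the receivable video size.

```python
def choose_main_region(regions, frame_width, max_width):
    """Pick the horizontal extent (a, b) of the main region.

    regions: list of (x_left, x_right, avg_motion) for each selected region.
    max_width: width allowed by the "receivable video size" (assumed to be
    the only size constraint, since the main region spans the full height).
    """
    width = min(max_width, frame_width)
    # Candidate left edges: the frame edges and every region's left edge
    # that still leaves room for a full-width window.
    candidates = sorted(
        {0, frame_width - width}
        | {x0 for x0, _, _ in regions if x0 + width <= frame_width}
    )
    best_a, best_score = 0, -1.0
    for a in candidates:
        b = a + width
        score = sum(m for x0, x1, m in regions if a <= x0 and x1 <= b)
        if score > best_score:  # ties keep the leftmost window
            best_a, best_score = a, score
    return best_a, best_a + width
```

Everything to the left of `a` and to the right of `b` would then form the auxiliary regions 302-1 and 302-2.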
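[0035]
In step 504, the main and auxiliary region cutout step, the main region 301 and the auxiliary regions 302-1 and 302-2 determined in step 503 are cut out of the wide-angle video area and sent to the main region processing unit 104 and the auxiliary region processing unit 107, respectively. The selected-region management table is sent to both units at the same time.
[0036]
Next, the operation of the main region processing unit 104 is described with reference to FIG. 8. FIG. 8 is a flowchart showing the operation of the main region processing unit of the video encoding transmission device according to the first embodiment of the present invention.
[0037]
The operation of the main region processing unit 104 consists of steps 801 and 802.
[0038]
In step 801, the low-quality region selection step, low-quality regions are selected by referring to the selected-region management table received from the video cutout unit 103. A low-quality region is a region whose average motion amount is at or below a threshold (for example, 32 pixels).
[0039]
In step 802, the low-quality region management table creation step, a low-quality region management table listing the low-quality regions selected in step 801 is created. Its format is the same as that of the selected-region management table shown in FIG. 10. More specifically, the table can be created by taking the low-quality regions selected in step 801 from the selected-region management table in descending order of region size (the total number of pixels in the region) and copying each entry in turn. The video data of the main region 301 received from the video cutout unit 103 is also passed to the main region encoding unit 105.
[0040]
Next, the operation of the auxiliary region processing unit 107 is described with reference to FIG. 9. FIG. 9 is a flowchart showing the operation of the auxiliary region processing unit of the video encoding transmission device according to the first embodiment of the present invention.
[0041]
The operation of the auxiliary region processing unit 107 consists of steps 901 and 902.
[0042]
In step 901, the child-screen region selection step, child-screen regions are selected by referring to the selected-region management table received from the video cutout unit 103. A child-screen region is a region whose average motion amount is at or above a threshold (for example, 320 pixels).
[0043]
In step 902, the child-screen region management table creation step, a child-screen region management table listing the child-screen regions selected in step 901 is created. Its format is the same as that of the selected-region management table shown in FIG. 10. More specifically, the table can be created by taking the child-screen regions selected in step 901 from the selected-region management table in descending order of region size and copying each entry in turn. The video data of the auxiliary regions 302-1 and 302-2 received from the video cutout unit 103 is also passed to the auxiliary region encoding unit 108.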
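Steps 801/802 and 901/902 amount to a threshold filter followed by a sort on region size. The sketch below illustrates this under the assumption that each table row is a dictionary with pixel coordinates and that the caller passes only the rows lying in the main region (for `low_quality_table`) or in the auxiliary regions (for `child_screen_table`); the function and key names are hypothetical.

```python
def region_size(row):
    """Total number of pixels in a rectangular region."""
    (x0, y0), (x1, y1) = row["top_left"], row["bottom_right"]
    return (x1 - x0) * (y1 - y0)

def low_quality_table(rows, threshold=32):
    """Steps 801/802: average motion at or below the threshold, largest first."""
    picked = [r for r in rows if r["avg_motion"] <= threshold]
    return sorted(picked, key=region_size, reverse=True)

def child_screen_table(rows, threshold=320):
    """Steps 901/902: average motion at or above the threshold, largest first."""
    picked = [r for r in rows if r["avg_motion"] >= threshold]
    return sorted(picked, key=region_size, reverse=True)
```

The default thresholds are the example values given in the text (32 and 320 pixels).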
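[0044]
Next, the operation of the code amount control unit 106 is described with reference to FIG. 7. FIG. 7 is a flowchart showing the operation of the code amount control unit of the video encoding transmission device according to the first embodiment of the present invention.
[0045]
The operation of the code amount control unit 106 consists of steps 701 to 710.
[0046]
In step 701, the code amount correction value is initialized to 0 and the "transmission video target code amount" stored in the operation parameter storage unit 111 is acquired. The transmission video target code amount is the target code amount of the video data that the device encodes and transmits. The code amount correction value is used to correct the frame code amount, which is the target code amount for encoding each frame.
[0047]
In step 702, the frame code amount setting step, the frame code amount is obtained from the "transmission video target code amount" and "frame rate" stored in the operation parameter storage unit 111 and from the code amount correction value. When the same code amount is assigned to every frame, the frame code amount is set, for example, by Formula 1 below.
[0048]
(Formula 1): frame code amount = ("transmission video target code amount" ÷ "frame rate") + code amount correction value
Although the same code amount is assigned to every frame here, the allocation may instead be varied from frame to frame.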
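[0049]
In step 703, the main region information acquisition step, the low-quality region management table is acquired from the main region processing unit 104.
[0050]
In step 704, the auxiliary region information acquisition step, the child-screen region management table is acquired from the auxiliary region processing unit 107.
[0051]
In step 705, the step of determining the low-quality and child-screen regions and allocating code amounts, the low-quality regions and child-screen regions are determined from the low-quality region management table acquired in step 703, the child-screen region management table acquired in step 704, and the "transmission video target code amount" stored in the operation parameter storage unit 111, and a code amount is allocated to each region. The result of the allocation is recorded in the management table shown in FIG. 11.
[0052]
FIG. 11 is a diagram showing an example of the selected-region correspondence management table of the video encoding transmission device according to the first embodiment of the present invention. The table consists of a "low-quality region ID" identifying the low-quality region (identical to the "region ID" in the low-quality region management table), a "child-screen region ID" identifying the child-screen region superimposed on the low-quality region (identical to the "region ID" in the child-screen region management table), a "low-quality region target code amount" (in bits), which is the target code amount for encoding the low-quality region, and a "child-screen region target code amount" (in bits), which is the target code amount for encoding the child-screen region. For example, the first row of FIG. 11 indicates that the child-screen region with ID 2 is superimposed on the low-quality region with ID 1, and that their code amounts are 1000 bits and 20000 bits, respectively.
[0053]
Specifically, the selected-region correspondence management table is created by the following procedure. First, a fixed number of low-quality regions (for example, three) is selected from the low-quality region management table in descending order of region size, in such a way that the share of the main region they occupy does not become too large (for example, 50% or less). Next, referring to the child-screen region management table, child-screen regions are assigned in descending order of region size to the low-quality regions selected earlier. The assignment is made so that the size of a superimposed child-screen region does not exceed the size of the low-quality region it hides.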
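The pairing procedure above can be sketched as two greedy passes. The function name `pair_regions`, the tuple layout, and the rule of giving each low-quality region the largest remaining child-screen region that fits are assumptions for illustration; the text leaves the exact matching rule open.

```python
def pair_regions(low_table, child_table, main_size, max_count=3, max_share=0.5):
    """Return (low_quality_region_id, child_screen_region_id) pairs.

    low_table / child_table: lists of (region_id, region_size), already
    sorted by region size in descending order.
    """
    # Pass 1: pick up to max_count low-quality regions, largest first,
    # keeping their total share of the main region at or below max_share.
    chosen, used = [], 0
    for region_id, size in low_table:
        if len(chosen) == max_count:
            break
        if used + size <= max_share * main_size:
            chosen.append((region_id, size))
            used += size

    # Pass 2: give each chosen region the largest remaining child-screen
    # region that is not bigger than the region it will hide.
    pairs, remaining = [], list(child_table)
    for low_id, low_size in chosen:
        for i, (child_id, child_size) in enumerate(remaining):
            if child_size <= low_size:
                pairs.append((low_id, child_id))
                del remaining[i]
                break
    return pairs
```

Low-quality regions for which no child-screen region fits are simply left unpaired in this sketch.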
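[0054]
Finally, a code amount is allocated to each pair of low-quality region and child-screen region as follows. First, the code amount originally allocated to the low-quality region is obtained by Formula 2.
[0055]
(Formula 2): original code amount of the low-quality region = frame code amount × (region size of the low-quality region ÷ region size of the main region)
This assumes that the code amount is allocated uniformly within each frame, but non-uniform allocation within a frame can also be handled easily.
[0056]
Next, the original code amount is divided at a fixed ratio (for example, 1 to 9) between the target code amount of the low-quality region and the target code amount of the child-screen region.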
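A short worked sketch of Formula 2 followed by the ratio split is given below. Integer arithmetic is used so that the two targets always add up to the original allocation; the function name `split_allocation` and the rounding rule are assumptions, and the default ratio is the 1-to-9 example from the text.

```python
def split_allocation(frame_bits, main_size, low_size, low_part=1, child_part=9):
    """Return (low_quality_target_bits, child_screen_target_bits)."""
    # Formula 2: share of the frame budget proportional to the region size
    original = frame_bits * low_size // main_size
    # Split the original allocation low_part : child_part
    low_target = original * low_part // (low_part + child_part)
    child_target = original - low_target
    return low_target, child_target
```

With a frame code amount of 400000 bits, a main region of 100000 pixels, and a low-quality region of 5000 pixels, the original allocation is 20000 bits, which the 1-to-9 split turns into 2000 bits for the low-quality region and 18000 bits for the child-screen region.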
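[0057]
The target code amount of the normal region, that is, the part of the main region other than the low-quality regions, is allocated by Formula 3.
[0058]
(Formula 3): normal region target code amount = frame code amount − (sum of the low-quality region target code amounts)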
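In step 706, the main region encoding request step, the low-quality region management table, the selected-region correspondence management table created in step 705, and the normal region target code amount are sent to the main region encoding unit 105 together with a request for encoding.
[0059]
In step 707, the auxiliary region encoding request step, the child-screen region management table and the selected-region correspondence management table created in step 705 are sent to the auxiliary region encoding unit 108 together with a request for encoding.
[0060]
In step 708, the main region result code amount acquisition step, the main region result code amount, which is the code amount produced by encoding one frame of the main region, is acquired from the main region encoding unit 105.
[0061]
In step 709, the auxiliary region result code amount acquisition step, the auxiliary region result code amount, which is the code amount produced by encoding one frame of the important regions, is acquired from the auxiliary region encoding unit 108.
[0062]
In step 710, the code amount correction value determination step, the code amount correction value is determined from the frame code amount set in step 702 and the main region result code amount and auxiliary region result code amount acquired in steps 708 and 709. The correction value is determined, for example, by Formula 4.
[0063]
(Formula 4): code amount correction value = frame code amount − (main region result code amount + auxiliary region result code amount)
If too much code was used, the code amount correction value is negative; if some was left over, it is positive. Reflecting the correction value in the frame code amount of the next frame (reducing the code amount available to the next frame when too much was used, and increasing it when some was left over) steers the output toward the "transmission video target code amount". Processing then returns to step 702.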
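The loop formed by steps 702 and 710 can be illustrated with the small simulation below. It is a sketch under one explicit assumption: the correction value of Formula 4 is added to the per-frame base budget, which is the sign that produces the behavior described above (overspending lowers the next frame's budget). The function names are hypothetical.

```python
def frame_code_amount(target_bits_per_second, frame_rate, correction):
    """Step 702: per-frame budget, adjusted by the correction value."""
    return target_bits_per_second / frame_rate + correction

def correction_value(frame_bits, main_result_bits, aux_result_bits):
    """Formula 4: negative when the frame overspent, positive when it underspent."""
    return frame_bits - (main_result_bits + aux_result_bits)

def simulate(target_bits_per_second, frame_rate, results):
    """results: list of (main_result_bits, aux_result_bits), one per frame.

    Returns the frame budget that was set for each frame.
    """
    correction, budgets = 0.0, []  # step 701: correction starts at 0
    for main_bits, aux_bits in results:
        budget = frame_code_amount(target_bits_per_second, frame_rate, correction)
        budgets.append(budget)
        correction = correction_value(budget, main_bits, aux_bits)
    return budgets
```

At 3,000,000 bits per second and 30 frames per second the base budget is 100,000 bits; a frame that uses 110,000 bits leaves a correction of −10,000, so the next frame is budgeted 90,000 bits.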
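[0064]
Next, the main region encoding unit 105 encodes the main region 301 based on the data of the main region 301 received from the main region processing unit 104 and on the low-quality region management table, the selected-region correspondence management table, and the target code amount of the normal region (the part of the main region other than the low-quality regions) received from the code amount control unit 106. A standard coding scheme such as MPEG-4 may be used, but various other schemes can also be used. Encoding is performed so as to keep to the code amounts allocated to the low-quality regions and to the normal region. More specifically, for example, the regions indicated by low-quality region IDs 1, 2, and 3 (their positions within the main region are obtained from the selected-region management table) are encoded with low-quality region target code amounts of 1000, 1500, and 2000 bits, respectively, while keeping to the normal region target code amount. The resulting code amount is then reported to the code amount control unit 106.
[0065]
In general, quantizing with a large quantization step during encoding compresses the information strongly but increases the quantization error and lowers image quality. Conversely, quantizing with a small quantization step compresses the information less but reduces the quantization error and improves image quality. To reduce the code amount of the low-quality regions only, it suffices to make their quantization step larger than that of the normal region.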
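The trade-off between quantization step and error can be seen with a toy uniform quantizer. This is a generic illustration, not the quantizer of MPEG-4 or of the device; the function names and sample values are made up for the example.

```python
def quantize(samples, step):
    """Map each sample to the index of its nearest quantization level."""
    return [round(s / step) for s in samples]

def reconstruct(levels, step):
    """Map level indices back to sample values."""
    return [q * step for q in levels]

def max_error(samples, step):
    """Largest absolute difference between a sample and its reconstruction."""
    rebuilt = reconstruct(quantize(samples, step), step)
    return max(abs(s - r) for s, r in zip(samples, rebuilt))
```

For the samples 3, 14, 27, 41, a step of 4 gives level indices up to 10 and a worst-case error of 2, while a step of 16 gives indices up to 3 (cheaper to code) at the cost of a worst-case error of 7.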
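[0066]
Next, the auxiliary region encoding unit 108 encodes the child-screen regions 303-1 and 303-2 based on the data of the child-screen regions 303-1 and 303-2 received from the auxiliary region processing unit 107 and on the child-screen region management table and the selected-region correspondence management table received from the code amount control unit 106. More specifically, for example, the regions indicated by child-screen region IDs 1, 2, and 3 (their positions within the auxiliary regions are obtained from the selected-region management table) are encoded separately so as to approach child-screen region target code amounts of 12000, 20000, and 18000 bits. Other regions are not encoded. The resulting code amount is then reported to the code amount control unit 106. As with the main region encoding unit 105, a standard coding scheme such as MPEG-4 may be used for each child-screen region, but various other schemes can also be used.
[0067]
Next, the video multiplexing unit 109 treats the encoded video data of the main region 301 output from the main region encoding unit 105 as one object and the encoded video data of each of the child-screen regions 303-1 and 303-2 output from the auxiliary region encoding unit 108 as one object each, multiplexes them into a single video data stream, and sends it to the video transmission unit 110. The MPEG-4 object coding scheme may be used for multiplexing, but any other scheme capable of multiplexing on a per-object basis may also be used.
[0068]
Finally, the video transmission unit 110 packetizes the multiplexed video data received from the video multiplexing unit 109, or divides it into cells, as appropriate for the network, and sends it out onto the network. The video data is sent at a transmission rate that satisfies the "initial network transmission bandwidth" stored in the operation parameter storage unit 111.
[0069]
Although all regions have been described above as rectangular, regions can also be represented in non-rectangular shapes by using multiple vectors.
[0070]
In step 503 of the flowchart of the video cutout unit 103, the main region is selected so as to contain as many selected regions with a large average motion amount as possible. Alternatively, the main region may be fixed without considering the selected regions, which reduces processing while providing the same effect as the present invention.
[0071]
In step 502 of the flowchart of the video cutout unit 103, only regions of at least a certain size (for example, four motion detection blocks) may be registered in the selected-region management table when choosing selected regions. This saves storage and excludes regions that are likely to be unnecessary in later processing, while providing the same effect as the present invention.
[0072]
The video cutout unit 103, the main region processing unit 104, and the auxiliary region processing unit 107 use motion vectors to judge the importance of regions. More efficient video transmission becomes possible by, for example, combining this with face recognition technology so that regions containing many human faces are judged important and regions containing no human faces are judged unimportant.
[0073]
As described above, in this embodiment a main region is cut out of the wide-angle video captured by a high-resolution wide-angle camera so as to contain many regions with large motion; regions with small motion in the main region are selected as low-quality regions and regions with large motion in the auxiliary regions are selected as child-screen regions; each is encoded as a separate object; and the child-screen regions are superimposed on the low-quality regions. This makes it possible to watch part of the wide-angle camera video (the main region) while being less likely to miss large changes in the remaining areas (the auxiliary regions), which is of great practical value.
[0074]
In addition, by reducing the code amount allocated to the low-quality regions and allocating the saved code amount to encoding the child-screen regions, the important parts of the wide-angle camera video can be transmitted over a limited transmission bandwidth, which is of great practical value.
[0075]
Furthermore, because the low-quality regions are encoded without reducing their code amount to 0, nothing is left black and blank even when the child-screen regions are not displayed, and subjective image quality is improved, which is of great practical value.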
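[0076]
(Embodiment 2)
FIG. 12 is a diagram showing an example of the configuration of the video encoding transmission device according to the second embodiment of the present invention. Blocks having the same function as in the configuration of the first embodiment shown in FIG. 1 are given the same reference numerals. The differences from the first embodiment are the operation parameter storage unit 1201, which stores the operation parameters of the device; the transmission information receiving unit 1202, which receives the transmission status of the video data from the video receiving terminal 202 or the video storage device 203; the video transmission unit 1203, which sends the multiplexed data to the network; the video cutout unit 1204, which cuts the main and auxiliary regions out of the wide-angle video area; and the code amount control unit 1205, which controls the code amount.
[0077]
The operation of each block is described in terms of its differences from the first embodiment.
[0078]
The transmission information receiving unit 1202, a new functional block, receives transmission information such as the "current network transmission bandwidth" sent continually from the video receiving terminal 202 or the video storage device 203. RTCP or a similar protocol can be used to send fluctuations in the network transmission bandwidth from the video receiving terminal 202 or the video storage device 203 to the device.
[0079]
The operation parameter storage unit 1201 stores the initial parameters described in the first embodiment, such as the "transmission video target code amount", "initial network transmission bandwidth", "receivable video size", and "frame rate", and additionally stores variable parameters such as the "current network transmission bandwidth" received by the transmission information receiving unit 1202.
[0080]
Unlike in the first embodiment, the video transmission unit 1203 continually changes the transmission rate to match the "current network transmission bandwidth" stored in the operation parameter storage unit 1201.
[0081]
In addition to its functions in the first embodiment, the video cutout unit 1204 changes the size of the main region taking the "current network transmission bandwidth" into account. When the transmission bandwidth decreases, the region size of the main region is reduced (for example, if the bandwidth falls by 10%, the region size of the main region is reduced by 10%); conversely, when the transmission bandwidth increases, the region size of the main region is enlarged.
[0082]
In addition to its functions in the first embodiment, the code amount control unit 1205 changes the number of low-quality regions taking the "current network transmission bandwidth" into account. When the transmission bandwidth decreases, the number of low-quality regions is reduced (for example, if the bandwidth falls by 10%, the number is reduced by one region); conversely, when the transmission bandwidth increases, the number of low-quality regions is increased.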
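The two adaptation rules of the second embodiment can be sketched together. The text gives only the examples for a 10% decrease; scaling the main region in direct proportion to the bandwidth, changing the count by one region per full 10% step, applying the same rule symmetrically to increases, and measuring the change against a reference bandwidth (for instance the initial network transmission bandwidth) are all assumptions of this sketch, as is the name `adapt_to_bandwidth`.

```python
def adapt_to_bandwidth(main_size, low_count, reference_bw, current_bw):
    """Return (new_main_size, new_low_quality_region_count).

    main_size: region size of the main region in pixels.
    low_count: number of low-quality regions used at the reference bandwidth.
    reference_bw / current_bw: bandwidths as integers in the same unit.
    """
    # Main region size follows the bandwidth proportionally
    # (a 10% drop in bandwidth gives a 10% smaller region).
    new_size = main_size * current_bw // reference_bw
    # One low-quality region is added or removed per full 10% change;
    # int() truncates toward zero, so partial steps are ignored.
    steps = int((current_bw - reference_bw) * 10 / reference_bw)
    return new_size, max(0, low_count + steps)
```

For example, going from 1000 to 900 kbit/s shrinks a 100000-pixel main region to 90000 pixels and reduces three low-quality regions to two.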
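[0083]
Furthermore, to reflect the subjective preferences of the user of the video receiving terminal 202, the parameters listed below may be sent from the video receiving terminal 202. The transmission information receiving unit 1202 receives these parameters and saves them in the operation parameter storage unit 1201. Each functional block uses the saved parameters to perform the operations described in the first or second embodiment.
[0084]
The parameters sent from the video receiving terminal 202 are, for example:
・the threshold for the motion vector difference in step 502
・the threshold for the average motion amount in step 801
・the threshold for the average motion amount in step 901
・the number of low-quality regions in step 705
・the code amount allocation ratio in step 705
[0085]
As described above, in addition to the configuration of the first embodiment, this embodiment receives transmission information from the video receiving terminal and changes the region size of the main region and the way low-quality regions are selected according to fluctuations in the network transmission bandwidth. This makes it possible to present the user with wide-angle camera video whose quality matches the network conditions, which is of great practical value.
[0086]
In addition, by allowing the user to change, from the video receiving terminal, the thresholds that determine the operation of the present invention, such as the average motion amount threshold and the motion vector difference threshold, video matching the user's subjective preferences can be presented to the user, which is of great practical value.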
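[0087]
(Embodiment 3)
FIG. 13 is a diagram showing an example of the selected-region correspondence management table of the video encoding transmission device according to the third embodiment of the present invention. A "child-screen region deformation flag" is added to the selected-region correspondence management table of the first embodiment. The flag determines whether, when the region size of a low-quality region differs from that of the child-screen region superimposed on it, the child-screen region is deformed to the same region size as the low-quality region. When the flag is ON, the child-screen region is deformed so that the sizes are identical; when it is OFF, no deformation is performed. The effect of this flag is described with reference to FIG. 14. FIG. 14 is a diagram showing an example of the video transmitted by the video encoding transmission device according to the third embodiment of the present invention. It shows the child-screen regions 1402-1 and 1402-2 superimposed on the main region 1401. For the child-screen region 1402-1 the flag is OFF, and the region is displayed with a region size that still differs from that of the low-quality region 1403-1. For the child-screen region 1402-2 the flag is ON, and its region size is the same as that of the low-quality region 1403-2.
[0088]
The actual deformation is performed by the auxiliary region processing unit 107 in response to a request from the code amount control unit 106, and the deformed region is encoded by the auxiliary region encoding unit 108.
[0089]
As described above, in this embodiment the selected-region correspondence management table of the first embodiment is provided with the "child-screen region deformation flag", which determines whether a child-screen region is deformed to the region size of the low-quality region it is superimposed on when the two sizes differ. This allows child-screen regions to be displayed without wasting the display area of the low-quality regions, which is of great practical value.
[0090]
[Effects of the invention]
As described above, according to the present invention, video captured by a high-resolution wide-angle camera is divided, according to fluctuations in the network transmission bandwidth or the magnitude of motion in the video, into a main region that forms the basic part of the encoded transmission and auxiliary regions that are the areas other than the main region; unimportant regions (low-quality regions) are selected within the main region and the code amount allocated to them is reduced; and video obtained by encoding the important regions (child-screen regions) in the auxiliary regions is superimposed on the unimportant regions in the encoded main region. This enables efficient video transmission that improves the at-a-glance visibility of important regions in the wide-angle video without encoding and transmitting the entire wide-angle camera video.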
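[Brief description of the drawings]
[FIG. 1] Diagram showing an example of the configuration of the video encoding transmission device according to the first embodiment of the present invention
[FIG. 2] Diagram showing an example of how the video encoding transmission device according to the first embodiment of the present invention is used
[FIG. 3] Diagram showing an example of a wide-angle video area processed by the video encoding transmission device according to the first embodiment of the present invention
[FIG. 4] Diagram showing an example of the video transmitted by the video encoding transmission device according to the first embodiment of the present invention
[FIG. 5] Flowchart showing an example of the operation of the video cutout unit of the video encoding transmission device according to the first embodiment of the present invention
[FIG. 6] Diagram explaining block-based motion detection in the video encoding transmission device according to the first embodiment of the present invention
[FIG. 7] Flowchart showing an example of the operation of the code amount control unit of the video encoding transmission device according to the first embodiment of the present invention
[FIG. 8] Flowchart showing an example of the operation of the main region processing unit of the video encoding transmission device according to the first embodiment of the present invention
[FIG. 9] Flowchart showing an example of the operation of the auxiliary region processing unit of the video encoding transmission device according to the first embodiment of the present invention
[FIG. 10] Diagram showing an example of the selected-region management table of the video encoding transmission device according to the first embodiment of the present invention
[FIG. 11] Diagram showing an example of the selected-region correspondence management table of the video encoding transmission device according to the first embodiment of the present invention
[FIG. 12] Diagram showing an example of the configuration of the video encoding transmission device according to the second embodiment of the present invention
[FIG. 13] Diagram showing an example of the selected-region correspondence management table of the video encoding transmission device according to the third embodiment of the present invention
[FIG. 14] Diagram showing an example of the video transmitted by the video encoding transmission device according to the third embodiment of the present invention
[FIG. 15] Diagram explaining the video transmitted by a conventional video processing device
[FIG. 16] Diagram showing the screen displayed by a videophone device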
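[Explanation of reference numerals]
101 Video input unit
102 Video storage unit
103 Video cutout unit
104 Main region processing unit
105 Main region encoding unit
106 Code amount control unit
107 Auxiliary region processing unit
108 Auxiliary region encoding unit
109 Video multiplexing unit
110 Video transmission unit
111 Operation parameter storage unit
[0001]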
TECHNICAL FIELD OF THE INVENTION
The present invention relates to a video encoding and transmitting apparatus that encodes and transmits an image obtained by shooting a wide-angle area at a high resolution.
[0002]
[Prior art]
Conventionally, video processing devices used in video surveillance systems, remote monitoring systems, and the like that use a wide-angle camera have employed the following method: when the user selects an arbitrary video area from the wide-angle video area input from the camera, the video processing device transmits the selected area to the receiving terminal, and the receiving terminal displays the selected area after applying pixel density conversion or similar processing to suit its display capability (see Patent Document 1).
[0003]
FIG. 15 is a diagram illustrating an embodiment of Patent Document 1. In FIG. 15, the user selects an arbitrary selection area 1503 from the camera shooting area 1501 input from the camera by using the user interface of the receiving terminal. The video processing device cuts out the selection area 1503 from the camera shooting area and transmits it to the receiving terminal in accordance with an instruction from the user. The receiving terminal performs pixel density conversion according to the display area of the terminal. Thus, the user can display a desired video area on the receiving terminal as the display screen 1502.
[0004]
With this configuration, the user can change the azimuth and the zoom magnification displayed by the receiving terminal by changing an arbitrary area (selection area 1503) in the camera shooting area.
[0005]
Patent Document 2 relates to a videophone device with a two-screen display function and discloses a method of detecting a specific area in the parent screen (such as a person's face or a text area) and controlling the position, size, and display priority of the child screen or the parent screen according to the result.
[0006]
FIG. 16 is a diagram illustrating an embodiment of Patent Document 2. FIG. 16 shows a display screen 1601 of a videophone before application of Patent Document 2 and a display screen 1604 of a videophone after application. In the display screen 1601 before application, a specific area in the parent screen area 1602 (the face area of the other party in this example) and a part of the child screen area 1603 (the self-portrait in this example) overlap. On the other hand, in the display screen 1604 after application, the parent screen area 1605 in which the parent screen area 1602 has been moved or the size has been changed does not overlap the child screen area 1603. Thus, by controlling the position / size / display priority of the parent screen or the child screen, it is possible to prevent specific areas from overlapping.
[0007]
Patent Document 3 discloses a method of encoding a parent screen in consideration of a display position / size of a child screen with respect to a videophone device having a two-screen display function.
[0008]
In this example, a fixed code amount is allocated to the parent screen area that will be hidden by the child screen, rather than reducing its code amount all the way to 0. This addresses the problem that a portion with no image appears when the position or size of the child screen is changed or the child screen is removed.
[0009]
Specifically, the local videophone device obtains information on the position and size of the child screen (the remote party's own image) from the remote videophone device and determines the area of the video it transmits (its own image) on which the remote terminal's child screen will be superimposed. The local device then encodes its own image, allocating to the determined area a smaller code amount than to other areas of the same size, and transmits it. Reducing the code amount of the area covered by the child screen keeps the code amount per screen down while limiting the loss of image quality when the display position or size of the child screen changes.
[0010]
[Patent Document 1]
JP-A-8-237590 (FIG. 6)
[Patent Document 2]
JP-A-10-200873 (FIGS. 6 to 9)
[Patent Document 3]
JP 2001-45476 A (FIG. 1)
[0011]
[Problems to be solved by the invention]
However, with the recent progress toward higher-resolution light-receiving elements such as CCDs and wider shooting angles, the video data captured by a wide-angle camera has become far larger than that of a camera with a normal angle of view, reaching several megabytes or more per screen, and conventional methods are insufficient for transmitting the captured video over a limited transmission bandwidth.
[0012]
Using the method disclosed in Patent Document 1, it is possible to make the selection area 1503 the same as the camera shooting area 1501 captured by the wide-angle camera and to transmit the video of the selection area 1503 compressed at a very high compression ratio that fits the limited transmission bandwidth. With this approach, however, the wide-angle camera video must be compressed at a high ratio, so the quality of the video received and displayed by the video receiving terminal is low. Alternatively, to raise the quality of the video received and displayed by the video receiving terminal, the user can select as the selection area 1503 only a part of the camera shooting area 1501, no larger than a certain size (so as to secure at least a certain image quality), and have it transmitted at a low compression ratio. In this case, however, the user cannot view the parts of the camera shooting area 1501 outside the selection area 1503 on the video receiving terminal, and the video data captured by the wide-angle camera is wasted.
[0013]
Patent Document 2 discloses a method for a videophone terminal that changes the position or size of the remote party's video or of the user's own image so that the own image does not overlap specific areas, such as faces or text, in the video sent from the remote party. However, the method targets the telephone network, a fixed-bandwidth transmission network, and assumes that the video sent from the remote party can occupy a fixed transmission bandwidth. It therefore cannot be applied as is to a transmission network whose bandwidth fluctuates. Moreover, the disclosed method addresses the relationship between the remote party's image and the own image, and it is difficult to apply as is when a parent image and multiple child images are transmitted simultaneously.
[0014]
Patent Document 3 discloses a method for a videophone terminal that acquires the display position of the remote party's child screen and keeps the code amount per screen down by lowering the code amount at the position in its own image where the remote party's child screen is expected to be superimposed. Like Patent Document 2, however, the method targets the telephone network, a fixed-bandwidth transmission network, and assumes that the transmitted video can occupy a fixed transmission bandwidth. It therefore cannot be applied as is to a transmission network whose bandwidth fluctuates. Moreover, the disclosed method considers the superposition of the local party's own image and the remote party's child screen, and it is difficult to apply as is when a single terminal transmits a parent screen and multiple child screens simultaneously.
[0015]
The present invention has been made to solve the above problems, and its object is to provide a video encoding transmission device that enables efficient information transmission while improving the at-a-glance visibility of important regions in a wide-angle video, without encoding and transmitting the entire wide-angle camera video.
[0016]
[Means for Solving the Problems]
To solve this problem, the present invention comprises: a video cutout unit that cuts out a first region and a plurality of second regions from a wide-angle video; a main region processing unit that extracts an unimportant region from the first region; a main region encoding unit that encodes the first region; an auxiliary region processing unit that extracts important regions from the second regions; an auxiliary region encoding unit that encodes the important regions; and a code amount control unit that determines the important region to be superimposed on the unimportant region and allocates code amounts to the first region and the important regions.
[0017]
As a result, video captured by a high-resolution wide-angle camera is divided, according to fluctuations in the transmission bandwidth or to importance, into a main region that forms the basic part of the encoded transmission and auxiliary regions that are the areas other than the main region. An unimportant region is selected within the main region and the code amount allocated to it is reduced, and video obtained by encoding an important region in an auxiliary region is superimposed on the unimportant region in the encoded main region. This yields the effect that video can be transmitted efficiently while improving the at-a-glance visibility of important regions in the wide-angle video, without encoding and transmitting the entire wide-angle camera video.
[0018]
Further, the invention of the present application is characterized in that the video clipping unit determines the first area so as to include many first specific areas. Thereby, the first region can be effectively selected as the main part.
[0019]
Furthermore, when the first specific region is a region that contains many video regions that are no larger than a predetermined size and have large motion, a region containing more actively moving subjects can be effectively selected as the main part. When the first specific region is a region that contains many video regions that are no larger than a predetermined size and contain many human faces, the main part can be selected more effectively, for example when transmitting surveillance video in which people need to be identified.
[0020]
Furthermore, the unimportant region is chosen from parts of the first region where motion is small or no human face is included, and the important region is chosen from parts of the second regions where motion is large or a human face is included. This allows the code amount to be adjusted for a receiving side with a limited display screen or transmission bandwidth, without transmitting the entire wide-angle video at high quality and without losing important information.
[0021]
Furthermore, the present invention additionally includes a transmission information receiving unit that receives video data transmitted from the video receiving terminal, so that the target code amount can be set more dynamically in light of the user's preferences, the current transmission conditions, and the like.
[0022]
BEST MODE FOR CARRYING OUT THE INVENTION
Hereinafter, embodiments of the present invention will be described with reference to FIGS. 1 to 14.
[0023]
(Embodiment 1)
FIG. 1 is a diagram showing an example of the configuration of the video encoding transmission device according to the first embodiment of the present invention. In FIG. 1, 101 is a video input unit that acquires wide-angle video from a wide-angle camera equipped with a light-receiving element such as a CCD; 102 is a video storage unit that temporarily stores the video input by the video input unit 101; 111 is an operation parameter storage unit that stores the parameters determining the operation of the video encoding transmission device; 103 is a video cutout unit that, based on the parameters stored in the operation parameter storage unit 111, cuts out of the wide-angle video stored in the video storage unit 102 a main region, which is the basic part of the encoded transmission, and auxiliary regions, which are the areas other than the main region; 104 is a main region processing unit that extracts unimportant regions from the main region cut out by the video cutout unit 103; 105 is a main region encoding unit that encodes the main region video output from the main region processing unit 104; 107 is an auxiliary region processing unit that extracts important regions from the auxiliary regions cut out by the video cutout unit 103; 108 is an auxiliary region encoding unit that encodes the important region video output from the auxiliary region processing unit 107; 106 is a code amount control unit that determines the unimportant and important regions and allocates a code amount to each region, based on the operation parameters stored in the operation parameter storage unit 111, the information on unimportant regions obtained from the main region processing unit 104, the information on important regions obtained from the auxiliary region processing unit 107, the code amount obtained from the main region encoding unit 105, and the code amount obtained from the auxiliary region encoding unit 108; 109 is a video multiplexing unit that multiplexes the encoded video data output from the main region encoding unit 105 and the auxiliary region encoding unit 108; and 110 is a video transmission unit that packetizes the video data in which the main region and auxiliary region video have been multiplexed by the video multiplexing unit 109 to suit the network environment and transmits it.
[0024]
FIG. 2 is a diagram illustrating an example of a usage form of the video encoding transmission device according to the first embodiment of the present invention. 201-1 and 201-2 are video encoding transmission devices of the present invention; 202 is a video receiving terminal that receives and decodes the encoded video transmitted from the video encoding transmission devices 201-1 and 201-2 and displays it on a screen; 203 is a video storage device that stores the encoded video transmitted from the video encoding transmission devices 201-1 and 201-2; and 204 is a network connecting the video encoding transmission devices 201-1 and 201-2, the video receiving terminal 202, and the video storage device 203. The network 204 may be not only an IP network such as a LAN or a WAN but also a network using another transmission method such as ATM. The video receiving terminal 202 needs a decoding unit for video data encoded by the encoding method (for example, MPEG4) used by the video encoding transmission device of the present invention; an MPEG4 decoder may be used for this purpose.
[0025]
The operation of the video coded transmission device configured as described above will be described below.
[0026]
First, an outline of the operation of the video encoding transmission device of the present invention will be described with reference to FIGS. 3 and 4. FIG. 3 is a diagram illustrating an example of a wide-angle video area to be processed by the video encoding transmission device according to the first embodiment of the present invention. The device takes as input a wide-angle video area (defined by coordinates (0, 0), (x, 0), (0, y), (x, y)) as shown in FIG. 3 and divides it into a main area 301 (defined by coordinates (a, 0), (b, 0), (a, y), (b, y)), which is the basic part for encoding and transmission, an auxiliary area 302-1 (defined by coordinates (0, 0), (a, 0), (0, y), (a, y)), and an auxiliary area 302-2 (defined by coordinates (b, 0), (x, 0), (b, y), (x, y)). In this figure the video is divided into one main area and two auxiliary areas, but the number of divisions can be changed freely. After the division into the two types of area, non-important areas are detected in the main area 301 and set as image quality reduction areas. In FIG. 3 these are the image quality reduction area 304-1 (defined by coordinates (ex1, ey1), (fx1, fy1)) and the image quality reduction area 304-2 (defined by coordinates (ex2, ey2), (fx2, fy2)). In the auxiliary areas 302-1 and 302-2, important areas are detected and set as child screen areas. In FIG. 3 these are the child screen area 303-1 (defined by coordinates (cx1, cy1), (dx1, dy1)) and the child screen area 303-2 (defined by coordinates (cx2, cy2), (dx2, dy2)). After the image quality reduction areas in the main area and the child screen areas in the auxiliary areas have been detected, the image quality reduction areas are encoded with low image quality, and the child screen areas are encoded and superimposed on them for transmission. A specific example is shown in FIG. 4. FIG. 4 is a diagram illustrating an example of a transmission image of the video encoding transmission device according to the first embodiment of the present invention, namely the image generated and encoded from the wide-angle video area illustrated in FIG. 3. The main area 401 (defined by coordinates (a, 0), (b, 0), (a, y), (b, y)) is the same as the main area 301, and the child screen areas 402-1 (defined by coordinates (ex1, ey1), (fx1, fy1)) and 402-2 (defined by coordinates (ex2, ey2), (fx2, fy2)) correspond to the child screen areas 303-1 and 303-2, respectively.
[0027]
Next, the detailed operation of the video encoding transmission device of the present invention will be described with reference to the drawings.
[0028]
First, the video input unit 101 acquires one frame of wide-angle video from a wide-angle camera having a light receiving element such as a CCD, and stores it in the video storage unit 102, which resides in a memory such as a DRAM or a flash memory. Each functional block performs its processing in units of one frame. In addition, before video encoding transmission starts, the operation parameter storage unit 111 stores initial information that is set either directly by the user or through an exchange between the video receiving terminal 202 or the video storage device 203 and the video encoding transmission device of the present invention. This initial information consists of the “transmission video target code amount” (the target code amount at the time of encoding by the device of the present invention), the “initial network transmission band”, the “receivable video size” (the display capability of the video receiving terminal 202), and the “frame rate” (the number of frames per second acquired from the wide-angle camera). These initial parameters can be exchanged with the video receiving terminal 202 or the video storage device 203 by a widely used method (SIP + SDP, RTSP + SDP, etc.).
[0029]
Next, when the wide-angle video has been stored in the video storage unit 102, the video clipping unit 103 operates according to the flowchart shown in FIG. 5. FIG. 5 is a flowchart illustrating the operation of the video clipping unit of the video encoding transmission device according to the first embodiment of the present invention.
[0030]
The operation of the video clipping unit 103 includes steps 501 to 504.
[0031]
In the motion detection step of step 501, blocks containing motion are detected by comparing the wide-angle video stored in the video storage unit 102 with a past video. The motion detection uses the method employed in widely used encoding schemes such as MPEG2 and MPEG4.
[0032]
FIG. 6 is a diagram for explaining motion detection in block units in the video encoding transmission device according to the first embodiment of the present invention. The motion detection step 501 will be briefly described with reference to FIG. 6. An image area 601 captured by a camera is divided into motion detection blocks 602, each consisting of a fixed number of pixels, and processing is performed in units of motion detection blocks. In MPEG2 and MPEG4, the motion detection block 602 is called a macroblock and consists of, for example, 16 × 16 pixels. Each motion detection block 602 is compared with the motion detection blocks in a past video area to search for a similar block. When a similar motion detection block is found, the motion vector 603 needed to move it to the current position of the motion detection block 602 is obtained. The motion vector 603 is expressed, for example, as 100 pixels in the x direction and 200 pixels in the y direction, but it may also be expressed in units of motion detection blocks, or as 150 pixels in the direction of 40 degrees (an angle measured clockwise with the x axis of the image area as the reference (0 degrees)). Hereinafter, the description assumes that a motion vector is expressed by a combination of an angle and a number of pixels. If no similar motion detection block can be found, the motion vector 603 of the motion detection block 602 is given an angle of 0 degrees and a number of pixels of ∞.
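The block-unit motion detection of step 501 can be sketched in Python as follows. This is only a minimal illustration under stated assumptions: the function names `sad` and `motion_vector`, the exhaustive search window, and the use of a sum of absolute differences with a threshold `max_sad` as the similarity test are all hypothetical choices, since the description only refers to the method used in MPEG2 and MPEG4. The small block size and the test frames are likewise illustrative.

```python
import math

def sad(cur, past, bx, by, px, py, n):
    """Sum of absolute differences between the n x n block at (bx, by) in cur
    and the n x n block at (px, py) in past."""
    return sum(abs(cur[by + j][bx + i] - past[py + j][px + i])
               for j in range(n) for i in range(n))

def motion_vector(cur, past, bx, by, n=4, search=4, max_sad=0):
    """Return (angle_deg, pixels) for the block whose top-left corner is (bx, by).

    The past frame is searched exhaustively within +/-search pixels. If no
    candidate has SAD <= max_sad, (0.0, inf) is returned, following the text.
    """
    h, w = len(past), len(past[0])
    best = None
    for py in range(max(0, by - search), min(h - n, by + search) + 1):
        for px in range(max(0, bx - search), min(w - n, bx + search) + 1):
            cost = sad(cur, past, bx, by, px, py, n)
            if cost <= max_sad and (best is None or cost < best[0]):
                best = (cost, px, py)
    if best is None:
        return 0.0, math.inf
    dx, dy = bx - best[1], by - best[2]
    # Image y grows downward, so atan2(dy, dx) is a clockwise angle from the x axis.
    return math.degrees(math.atan2(dy, dx)) % 360.0, math.hypot(dx, dy)

# Hypothetical 12 x 12 frames: the current frame is the past frame moved 3 pixels right.
past = [[x + 100 * y for x in range(12)] for y in range(12)]
cur = [[past[y][x - 3] if x >= 3 else -1 for x in range(12)] for y in range(12)]
angle, pixels = motion_vector(cur, past, bx=4, by=4)
```

A block that moved straight down would yield an angle of 90 degrees under this convention, because the angle is measured clockwise from the x axis.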
[0033]
In the selection area determination step of step 502, selection areas are chosen using the motion vectors 603 obtained in step 501. Selection areas 604-1 and 604-2 are obtained by connecting motion detection blocks 602, as follows. First, the motion vector 603 of each motion detection block 602 is examined, and vectors whose mutual difference (in angle and number of pixels) is within a threshold (for example, 10%) are identified. Among the motion detection blocks whose motion vector difference is equal to or smaller than the threshold, blocks that are connected to each other are grouped and set as selection areas 604-1 and 604-2. If a selection area 604-1 or 604-2 is not rectangular, neighboring areas are connected to it so that it becomes rectangular. The selection areas are managed by the selection area management table shown in FIG. 10. FIG. 10 is a diagram illustrating an example of the selection area management table of the video encoding transmission device according to the first embodiment of the present invention. The selection area management table consists of the “area ID” identifying the selection area, the “upper left coordinate” of the upper left motion detection block 602 of the selection area, the “lower right coordinate” of the lower right motion detection block 602 of the selection area, the “average motion amount” (in pixels) of the motion vectors 603 of the motion detection blocks 602 in the selection area, and the “average motion direction” (in degrees) of the motion vectors 603 of the motion detection blocks 602 in the selection area. For example, the selection area with area ID 1 has upper left coordinates (xa1, ya1), lower right coordinates (xb1, yb1), an average motion amount of 10 pixels, and an average motion direction of 30 degrees.
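As a rough sketch of step 502, the following Python code groups connected motion detection blocks with similar motion vectors and produces rows resembling the selection area management table. Several points are assumptions made for illustration only: the names `similar` and `selection_areas`, the reading of the 10% threshold (pixel counts within 10% of the larger value, angles within 10% of a full turn), the use of 4-neighbour connectivity, the use of a bounding rectangle to make each area rectangular, and the circular mean used for the average direction.

```python
import math

def similar(v1, v2, tol=0.10):
    """Assumed similarity test: pixel counts within tol of the larger one and
    angles within tol of a full turn (36 degrees for tol = 0.10)."""
    (a1, p1), (a2, p2) = v1, v2
    d_angle = abs(a1 - a2) % 360.0
    d_angle = min(d_angle, 360.0 - d_angle)
    return abs(p1 - p2) <= tol * max(p1, p2) and d_angle <= tol * 360.0

def selection_areas(vectors, tol=0.10):
    """vectors maps the (col, row) of a motion detection block to (angle_deg, pixels).
    Returns one table row per connected group of similar blocks."""
    seen, table = set(), []
    for start in sorted(vectors):
        if start in seen or math.isinf(vectors[start][1]):
            continue  # blocks without a match (pixels = inf) are left out
        seen.add(start)
        group, stack = [], [start]
        while stack:
            c, r = block = stack.pop()
            group.append(block)
            for nb in ((c + 1, r), (c - 1, r), (c, r + 1), (c, r - 1)):
                if (nb in vectors and nb not in seen
                        and not math.isinf(vectors[nb][1])
                        and similar(vectors[block], vectors[nb], tol)):
                    seen.add(nb)
                    stack.append(nb)
        cols = [c for c, _ in group]
        rows = [r for _, r in group]
        sin_sum = sum(math.sin(math.radians(vectors[b][0])) for b in group)
        cos_sum = sum(math.cos(math.radians(vectors[b][0])) for b in group)
        table.append({
            "area_id": len(table) + 1,
            "upper_left": (min(cols), min(rows)),    # bounding rectangle, in block units
            "lower_right": (max(cols), max(rows)),
            "avg_motion": sum(vectors[b][1] for b in group) / len(group),
            "avg_direction": math.degrees(math.atan2(sin_sum, cos_sum)) % 360.0,
        })
    return table

# Hypothetical input: two moving groups separated by a block with no match.
vectors = {(0, 0): (30, 10), (1, 0): (30, 10), (0, 1): (32, 10.5),
           (2, 0): (0, math.inf), (3, 0): (200, 300), (4, 0): (200, 310)}
table = selection_areas(vectors)
```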
[0034]
In the main area and auxiliary area determination step of step 503, the selection area management table is searched, and the main area is determined so as to include as many selection areas with a large average motion amount as possible without exceeding the “receivable video size” stored in the operation parameter storage unit 111. The following description assumes that the determined main area is the main area 301 shown in FIG. 3. The areas of the wide-angle video area other than the main area 301 are the auxiliary areas 302-1 and 302-2.
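One possible way to realize step 503 is a sliding window over the horizontal axis, sketched below in Python. The description does not specify the search procedure, so everything here is an assumption: the function name `choose_main_area`, a main area that spans the full frame height as in FIG. 3, a window width equal to the width taken from the “receivable video size”, and a score equal to the summed average motion amount of the selection areas lying entirely inside the window.

```python
def choose_main_area(table, frame_width, max_width):
    """Return (a, b), the left and right x coordinates of the main area."""
    def score(a):
        b = a + max_width
        return sum(row["avg_motion"] for row in table
                   if a <= row["upper_left"][0] and row["lower_right"][0] <= b)

    # Candidate left edges: every x at which a selection area starts,
    # clamped so that the window stays inside the frame.
    limit = max(frame_width - max_width, 0)
    candidates = {0} | {min(row["upper_left"][0], limit) for row in table}
    a = max(sorted(candidates), key=score)
    return a, min(a + max_width, frame_width)

# Hypothetical selection areas given in pixel coordinates (x, y).
areas = [
    {"upper_left": (100, 50), "lower_right": (200, 150), "avg_motion": 10},
    {"upper_left": (900, 80), "lower_right": (1100, 300), "avg_motion": 300},
    {"upper_left": (1200, 40), "lower_right": (1400, 200), "avg_motion": 200},
]
a, b = choose_main_area(areas, frame_width=1920, max_width=640)
```

With these sample values the window starting at x = 900 contains the two areas with the largest motion, so the main area becomes (900, 1540) and the remaining left and right strips become the auxiliary areas.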
[0035]
In the main area and auxiliary area cut-out step of step 504, the main area 301 and the auxiliary areas 302-1 and 302-2 determined in step 503 are cut out from the wide-angle video area and sent to the main area processing unit 104 and the auxiliary area processing unit 107, respectively. At the same time, the selection area management table is also sent to the main area processing unit 104 and the auxiliary area processing unit 107.
[0036]
Next, the operation of the main area processing unit 104 will be described with reference to FIG. 8. FIG. 8 is a flowchart illustrating the operation of the main area processing unit of the video encoding transmission device according to the first embodiment of the present invention.
[0037]
The operation of the main area processing unit 104 includes Step 801 and Step 802.
[0038]
In the image quality reduction area selection step of step 801, the image quality reduction area is selected with reference to the selection area management table received from the video clipping unit 103. The image quality reduction area is an area in which the value of the average motion amount is equal to or smaller than a certain threshold (for example, 32 pixels).
[0039]
In the image quality reduction area management table creation step of step 802, an image quality reduction area management table listing the image quality reduction areas selected in step 801 is created. The format of the image quality reduction area management table is the same as that of the selection area management table shown in FIG. 10. More specifically, the image quality reduction area management table can be created by selecting, from the selection area management table, the image quality reduction areas chosen in step 801 in order of area size (the total number of pixels included in the area) and copying each item in sequence. Further, the video data of the main area 301 received from the video clipping unit 103 is sent to the main area encoding unit 105.
[0040]
Next, the operation of the auxiliary area processing unit 107 will be described with reference to FIG. 9. FIG. 9 is a flowchart illustrating the operation of the auxiliary area processing unit of the video encoding transmission device according to the first embodiment of the present invention.
[0041]
The operation of the auxiliary area processing unit 107 includes Step 901 and Step 902.
[0042]
In the child screen area selection step of step 901, child screen areas are selected with reference to the selection area management table received from the video clipping unit 103. A child screen area is an area whose average motion amount is equal to or larger than a threshold (for example, 320 pixels).
[0043]
In the child screen area management table creation step of step 902, a child screen area management table listing the child screen areas selected in step 901 is created. The format of the child screen area management table is the same as that of the selection area management table shown in FIG. 10. More specifically, the child screen area management table can be created by selecting, from the selection area management table, the child screen areas chosen in step 901 in order of area size and copying each item in sequence. The video data of the auxiliary areas 302-1 and 302-2 received from the video clipping unit 103 is sent to the auxiliary area encoding unit 108.
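Steps 801 to 802 and 901 to 902 both filter the selection area management table by a threshold and list the result by area size, which can be sketched in Python as follows. The function names and the dictionary layout are illustrative assumptions, and the largest-first ordering is also an assumption, since the text only says that areas are taken in order of area size. The thresholds are the example values given in the description.

```python
LOW_MOTION_MAX = 32     # pixels, example threshold of step 801
HIGH_MOTION_MIN = 320   # pixels, example threshold of step 901

def area_size(row):
    """Total number of pixels of a rectangular area."""
    (x1, y1), (x2, y2) = row["upper_left"], row["lower_right"]
    return (x2 - x1) * (y2 - y1)

def build_tables(main_rows, auxiliary_rows):
    """Return (image quality reduction area table, child screen area table)."""
    def by_size(rows):
        return sorted(rows, key=area_size, reverse=True)  # largest first (assumed)

    reduction = [r for r in main_rows if r["avg_motion"] <= LOW_MOTION_MAX]
    child = [r for r in auxiliary_rows if r["avg_motion"] >= HIGH_MOTION_MIN]
    return by_size(reduction), by_size(child)

# Hypothetical selection areas inside the main area and the auxiliary areas.
main_rows = [
    {"area_id": 1, "upper_left": (0, 0), "lower_right": (100, 50), "avg_motion": 10},
    {"area_id": 2, "upper_left": (0, 0), "lower_right": (200, 100), "avg_motion": 5},
    {"area_id": 3, "upper_left": (0, 0), "lower_right": (300, 300), "avg_motion": 100},
]
auxiliary_rows = [
    {"area_id": 4, "upper_left": (0, 0), "lower_right": (50, 50), "avg_motion": 400},
    {"area_id": 5, "upper_left": (0, 0), "lower_right": (80, 80), "avg_motion": 320},
    {"area_id": 6, "upper_left": (0, 0), "lower_right": (90, 90), "avg_motion": 50},
]
reduction_table, child_table = build_tables(main_rows, auxiliary_rows)
```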
[0044]
Next, the operation of the code amount control unit 106 will be described with reference to FIG. 7. FIG. 7 is a flowchart illustrating the operation of the code amount control unit of the video encoding transmission device according to the first embodiment of the present invention.
[0045]
The operation of the code amount control unit 106 includes steps 701 to 710.
[0046]
In step 701, the code amount correction value is initialized to 0, and the “transmission video target code amount” stored in the operation parameter storage unit 111 is obtained. The transmission video target code amount is a target code amount of video data to be encoded and transmitted by the present apparatus. The code amount correction value is a value for correcting a frame code amount, which is a target code amount when encoding each frame.
[0047]
In the frame code amount setting step of step 702, the frame code amount is determined from the “transmission video target code amount”, the “frame rate”, and the code amount correction value stored in the operation parameter storage unit 111. Here, when the same code amount is assigned to each frame, the frame code amount is set by, for example, the following equation (Equation 1).
[0048]
(Equation 1): Frame code amount = (“transmission video target code amount” ÷ “frame rate”) − code amount correction value
In the above description, the same code amount is assigned to each frame, but the assigned amount may be changed for each frame.
[0049]
In the main area information acquisition step of step 703, the image quality reduction area management table is acquired from the main area processing unit 104.
[0050]
In the auxiliary area information obtaining step of step 704, a small screen area management table is obtained from the auxiliary area processing unit 107.
[0051]
In step 705, the step of determining the image quality reduction areas and child screen areas and allocating the code amount, the image quality reduction areas and child screen areas to be used are determined from the image quality reduction area management table acquired in step 703, the child screen area management table acquired in step 704, and the “transmission video target code amount” stored in the operation parameter storage unit 111, and a code amount is allocated to each area. As a result of the allocation, the management table shown in FIG. 11 is created.
[0052]
FIG. 11 is a diagram illustrating an example of the selected area correspondence management table of the video encoding transmission device according to the first embodiment of the present invention. The selected area correspondence management table consists of the “image quality reduction area ID” identifying the image quality reduction area (the same as the “area ID” in the image quality reduction area management table), the “child screen area ID” identifying the child screen area superimposed on it (the same as the “area ID” in the child screen area management table), the “image quality reduction area target code amount” (in bits), which is the target code amount when encoding the image quality reduction area, and the “child screen area target code amount” (in bits), which is the target code amount when encoding the child screen area. For example, the first line in FIG. 11 shows that the area with child screen area ID 2 is superimposed on the area with image quality reduction area ID 1, and that the code amounts of the two areas are 1000 and 20000 bits, respectively.
[0053]
Specifically, the selected area correspondence management table is created by the following procedure. First, a fixed number of image quality reduction areas (for example, three areas) are selected from the image quality reduction area management table in ascending order of area size. The selection is made such that the proportion of the main area occupied by image quality reduction areas does not become too large (for example, 50% or less). Next, with reference to the child screen area management table, the child screen areas to be superimposed on the previously selected image quality reduction areas are determined in order of increasing area size. However, each child screen area to be superimposed is chosen so that its area size is not larger than the area size of the image quality reduction area it hides.
[0054]
Finally, a code amount is assigned to each set of the image quality reduction area and the small screen area by the following method. First, the old allocated code amount of the low image quality area is calculated by the equation shown in (Equation 2).
[0055]
(Equation 2): Old allocated code amount of the image quality reduction area = frame code amount × (area size of the image quality reduction area ÷ area size of the main area)
Although the above description is based on the assumption that the code amount is uniformly allocated in each frame, it is easy to cope with the case where the code amount is not uniform in each frame.
[0056]
Next, the target code amount of the low image quality region and the target code amount of the small screen region are allocated at a fixed ratio (for example, 1 to 9) of the old allocated code amount.
[0057]
In addition, the allocation of the target code amount in the normal area, which is an area other than the image quality reduction area, is performed by the equation shown in (Equation 3).
[0058]
(Equation 3): Normal area target code amount = frame code amount − (sum of target code amounts for low image quality area)
In the main area coding requesting step of step 706, the image quality reduction area management table, the selected area correspondence management table created in step 705, and the normal area target code amount are notified to the main area coding unit 105 and the coding is performed. Make a request.
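The pairing and code amount allocation of step 705, whose results are passed on in this request, can be sketched in Python as follows. This is an illustrative reading rather than a definitive implementation: the function name `build_correspondence`, the tuple layout, and the greedy pairing are assumptions, and the input lists are assumed to be already in the order of the respective management tables. The example values (three areas, a 50% limit, a 1 to 9 ratio) are the ones given in the description, and Equations 2 and 3 are applied as written.

```python
def build_correspondence(reduction, child, main_size, frame_code,
                         max_areas=3, max_share=0.5, ratio=(1, 9)):
    """reduction and child are lists of (area_id, area_size) in table order.
    Returns (correspondence rows, normal area target code amount)."""
    rows, used, covered = [], set(), 0
    for red_id, red_size in reduction[:max_areas]:
        if covered + red_size > max_share * main_size:
            continue  # keep the share of image quality reduction areas bounded
        # First unused child screen area that is not larger than the area it hides.
        match = next(((cid, size) for cid, size in child
                      if cid not in used and size <= red_size), None)
        if match is None:
            continue
        used.add(match[0])
        covered += red_size
        old = frame_code * red_size / main_size            # Equation 2
        low = old * ratio[0] / sum(ratio)                  # fixed ratio, e.g. 1 to 9
        rows.append({"reduction_id": red_id, "child_id": match[0],
                     "reduction_bits": low, "child_bits": old - low})
    normal = frame_code - sum(r["reduction_bits"] for r in rows)   # Equation 3
    return rows, normal

# Hypothetical sizes in pixels and a frame code amount in bits.
rows, normal = build_correspondence(
    reduction=[(1, 20000), (2, 10000), (3, 5000)],
    child=[(1, 15000), (2, 8000), (3, 30000)],
    main_size=100000, frame_code=200000)
```

In this example the third image quality reduction area receives no child screen, because the only remaining child screen area is larger than the area it would hide.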
[0059]
In the auxiliary area encoding request step of step 707, the child screen area management table and the selected area correspondence management table created in step 705 are notified to the auxiliary area encoding unit 108 and an encoding request is made.
[0060]
In the main area result code amount acquisition step of step 708, the main area result code amount, which is the code amount of the result of encoding one frame of the main area, is obtained from the main area coding unit 105.
[0061]
In the auxiliary area result code amount acquisition step of step 709, the auxiliary area result code amount, which is the code amount resulting from encoding one frame of the important areas, is acquired from the auxiliary area encoding unit 108.
[0062]
In the code amount correction value determination step of step 710, the code amount correction value is determined from the frame code amount set in step 702, the main area result code amount acquired in step 708, and the auxiliary area result code amount acquired in step 709. The code amount correction value is determined, for example, by the equation shown in (Equation 4).
[0063]
(Equation 4): Code amount correction value = frame code amount − (main area result code amount + auxiliary area result code amount)
If more code than the frame code amount was used, the code amount correction value becomes negative, and if code was left over, it becomes positive. By reflecting the code amount correction value in the frame code amount of the next frame (reducing the code amount used in the next frame after overuse and increasing it after a surplus), the overall code amount is corrected so as to approach the “transmission video target code amount”. Processing then returns to step 702.
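The feedback loop of steps 701, 702, and 710 can be illustrated with the short Python sketch below. The function names and the sample figures are hypothetical. The sketch assumes that the correction value of Equation 4 is added to the per-frame budget, because that is the sign under which overuse lowers the next frame's budget as described for step 710; the sign convention should be treated as an assumption of this example.

```python
def frame_code_amount(target_bits_per_second, frame_rate, correction):
    """Per-frame budget; the correction is added (assumed sign convention)."""
    return target_bits_per_second / frame_rate + correction

def correction_value(frame_code, main_result, auxiliary_result):
    """Equation 4: negative after overuse, positive after a surplus."""
    return frame_code - (main_result + auxiliary_result)

correction = 0.0                                   # step 701
budgets = []
# Hypothetical (main area, auxiliary area) result code amounts for three frames.
for main_bits, aux_bits in [(40000, 12000), (36000, 10000), (38000, 12000)]:
    budget = frame_code_amount(1_500_000, 30, correction)          # step 702
    budgets.append(budget)
    correction = correction_value(budget, main_bits, aux_bits)     # step 710
```

In this run the first frame overshoots its 50000-bit budget by 2000 bits, so the second frame is given 48000 bits; the second frame then leaves 2000 bits unused, which raises the third budget to 52000 bits.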
[0064]
Next, the main area encoding unit 105 encodes the main area 301 based on the data of the main area 301 received from the main area processing unit 104 and on the image quality reduction area management table, the selected area correspondence management table, and the normal area target code amount (the normal area being the part of the main area other than the image quality reduction areas) received from the code amount control unit 106. As a coding method, a standard method such as MPEG4 may be used, but the method is not limited to this and various methods can be used. Encoding is performed so as to keep to the code amount allocated to each image quality reduction area and to the code amount of the normal area. More specifically, for example, the areas indicated by image quality reduction area IDs 1, 2, and 3 (the information indicating where each area lies in the main area is obtained from the selection area management table) are encoded with image quality reduction area target code amounts of 1000, 1500, and 2000 bits, respectively, while the normal area is encoded so as to keep to the normal area target code amount. Thereafter, the code amount control unit 106 is notified of the code amount resulting from the encoding.
[0065]
Generally, at the time of encoding, if the quantization is performed with a large quantization step width, the amount of information can be largely compressed, but the quantization error increases and the image quality deteriorates. Conversely, when quantization is performed with a small quantization step width, the amount of information compression is reduced, but the quantization error is reduced and the image quality is improved. In order to reduce the code amount only in the low image quality area, the quantization step width of the area may be made larger than that of the normal area.
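The trade-off between quantization step width and quantization error can be seen in a few lines of Python. This is a generic illustration of uniform quantization, not the quantizer of any particular codec; the function names and the sample coefficient values are made up for the example.

```python
def quantize(coefficients, step):
    """Uniform quantization: map each value to the nearest multiple of step."""
    return [round(c / step) * step for c in coefficients]

def mean_abs_error(original, reconstructed):
    """Average absolute quantization error."""
    return sum(abs(o - r) for o, r in zip(original, reconstructed)) / len(original)

coefficients = [-37, -12, -3, 0, 5, 14, 26, 61]   # hypothetical sample values
fine = quantize(coefficients, 4)      # small step width, e.g. for the normal area
coarse = quantize(coefficients, 16)   # large step width, e.g. for a reduction area
fine_error = mean_abs_error(coefficients, fine)
coarse_error = mean_abs_error(coefficients, coarse)
```

The coarse quantizer maps the samples onto fewer distinct levels, which is what allows stronger compression, but its mean error is larger than that of the fine quantizer.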
[0066]
Next, the auxiliary area encoding unit 108 encodes the child screen areas 303-1 and 303-2 based on the data of the child screen areas 303-1 and 303-2 received from the auxiliary area processing unit 107 and on the child screen area management table and the selected area correspondence management table received from the code amount control unit 106. More specifically, for example, the areas indicated by child screen area IDs 1, 2, and 3 (the information indicating where each area lies in the auxiliary area is obtained from the selection area management table) are each encoded so as to approach child screen area target code amounts of 12000, 20000, and 18000 bits, respectively. No encoding is performed for other areas. Thereafter, the code amount control unit 106 is notified of the code amount resulting from the encoding. As the coding method for each child screen area, a standard method such as MPEG4 may be used as in the main area encoding unit 105, but the method is not limited to this and various methods can be used.
[0067]
Next, the video multiplexing unit 109 takes the encoded video data of the main area 301 output from the main area encoding unit 105 as one object and the encoded video data of the child screen areas 303-1 and 303-2 output from the auxiliary area encoding unit 108 each as one object, multiplexes them into a single stream of video data, and sends the result to the video transmission unit 110. As the multiplexing method, the MPEG4 object encoding method may be used, but any other method capable of multiplexing on an object basis may also be used.
[0068]
Finally, the video transmission unit 110 performs packetization or cellization of the multiplexed video data received from the video multiplexing unit 109 according to the network, and transmits the multiplexed video data to the network. The transmission of the video data is performed at a transmission rate that satisfies the “initial network transmission band” stored in the operation parameter storage unit 111.
[0069]
Furthermore, in the above description, all regions are described as being rectangular, but by using a plurality of vectors, a region can be represented by a shape other than a rectangle.
[0070]
Further, in step 503 of the operation flowchart of the video clipping unit 103, the main area is selected so as to include as many selection areas with a large average motion amount as possible. However, by instead fixing the main area without considering the selection areas, the same effect as the present invention can be obtained while reducing the processing.
[0071]
When selecting selection areas in step 502 of the operation flowchart of the video clipping unit 103, only areas of at least a certain size (for example, four motion detection blocks) may be registered in the selection area management table. This reduces the storage area and eliminates areas that are likely to be unnecessary in the subsequent processing, while obtaining the same effects as the present invention.
[0072]
The video clipping unit 103, the main area processing unit 104, and the auxiliary area processing unit 107 determine the importance of an area using the motion vector. Alternatively, an area that includes a human face may be determined to be an important area and an area that does not include a human face a non-important area, which enables more efficient video transmission.
[0073]
As described above, in the present embodiment, the main area is cut out from the wide-angle video captured by a high-resolution wide-angle camera so as to include many areas with large motion, areas with small motion are selected from the main area as image quality reduction areas, areas with large motion are selected from the auxiliary areas as child screen areas, each is encoded as a separate object, and the child screen areas are superimposed on the image quality reduction areas. This configuration reduces the possibility of overlooking a large change in another area (an auxiliary area) while viewing one part (the main area) of the wide-angle camera video, and the practical effect is large.
[0074]
In addition to the above, by providing a configuration in which the code amount allocated to the image quality reduction areas is reduced and the code amount thus saved is allocated to the child screen areas for encoding, it becomes possible to transmit the important parts of the wide-angle camera video within a limited transmission band, and the practical effect is great.
[0075]
In addition, since the image quality reduction areas are encoded without setting their code amount to 0, an image quality reduction area is not shown as a blank black region even when no child screen area is displayed on it, so the subjective image quality improves and the practical effect is great.
[0076]
(Embodiment 2)
FIG. 12 is a diagram illustrating an example of a configuration of a video encoding and transmitting apparatus according to the second embodiment of the present invention. Blocks having the same operation as the configuration of the present apparatus in the first embodiment shown in FIG. 1 are denoted by the same reference numerals. The difference from the first embodiment is that an operation parameter storage unit 1201 that stores operation parameters of the present apparatus, a transmission information reception unit 1202 that receives a transmission state of video data from the video reception terminal 202 and the video storage device 203, A video transmission unit 1203 transmits the multiplexed data to the network, a video cutout unit 1204 cuts out the main area or the auxiliary area from the wide-angle video area, and a code amount control unit 1205 controls the code amount.
[0077]
Regarding the operation of each block, differences from the first embodiment will be described.
[0078]
The transmission information receiving unit 1202, which is a new functional block, receives transmission information such as the “current network transmission band” sequentially transmitted from the video receiving terminal 202 or the video storage device 203. As a method of transmitting the fluctuation state of the network transmission band from the video receiving terminal 202 or the video storage device 203 to the present device, RTCP or the like can be used.
[0079]
The operation parameter storage unit 1201 stores the initial parameters described in the first embodiment, such as the “transmission video target code amount”, the “initial network transmission band”, the “receivable video size”, and the “frame rate”, as well as variable parameters such as the “current network transmission band” received by the transmission information receiving unit 1202.
[0080]
Unlike the first embodiment, the video transmission unit 1203 changes the transmission rate successively according to the “current network transmission band” stored in the operation parameter storage unit 1201.
[0081]
The video clipping unit 1204 changes the size of the main area in consideration of the “current network transmission band” in addition to performing the operation of the first embodiment. When the transmission band decreases, the area size of the main area is reduced (for example, if the transmission band decreases by 10%, the area size of the main area is reduced by 10%). Conversely, when the transmission band increases, the area size of the main area is increased.
[0082]
The code amount control unit 1205 changes the number of image quality reduction areas in consideration of the “current network transmission band” in addition to performing the operation of the first embodiment. When the transmission band decreases, the number of image quality reduction areas is reduced (for example, if the transmission band decreases by 10%, the number is reduced by one area). Conversely, when the transmission band increases, the number of image quality reduction areas is increased.
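The band-dependent adjustments of the video clipping unit 1204 and the code amount control unit 1205 can be sketched together in Python. Only the two numerical examples (10% of band corresponds to 10% of main area size and to one image quality reduction area) come from the description; the function name `adapt`, the linear extension of those examples to other band changes, and the measurement of the change relative to the initial band are assumptions.

```python
def adapt(main_size, reduction_count, initial_band, current_band):
    """Return (new main area size, new number of image quality reduction areas)."""
    percent = round(100 * (current_band - initial_band) / initial_band)
    new_size = main_size * (100 + percent) // 100           # same percentage as the band
    new_count = max(0, reduction_count + int(percent / 10))  # one area per 10%
    return new_size, new_count

# Hypothetical values: a 640 x 480 main area and three image quality reduction areas.
smaller = adapt(640 * 480, 3, initial_band=1000, current_band=900)
larger = adapt(640 * 480, 3, initial_band=1000, current_band=1200)
```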
[0083]
Further, in order to reflect the subjective preference of the user of the video receiving terminal 202, the following parameters may be transmitted from the video receiving terminal 202. The transmission information receiving unit 1202 receives these parameters and stores them in the operation parameter storage unit 1201. Each functional block performs the operation described in the first embodiment or the second embodiment using the stored parameters.
[0084]
The parameters transmitted from the video receiving terminal 202 are, for example, the following:
・Threshold value of the difference between the motion vectors in step 502
・Threshold value of the average motion amount in step 801
・Threshold value of the average motion amount in step 901
・Number of image quality reduction areas in step 705
・Allocation ratio of the code amount in step 705
[0085]
As described above, in the present embodiment, in addition to the configuration of the first embodiment, transmission information from the video receiving terminal is received, and the area size of the main area and the method of selecting image quality reduction areas are changed according to fluctuations in the transmission band of the network. This makes it possible to present the user with wide-angle camera video of a quality that matches the network fluctuations, and the practical effect is large.
[0086]
In addition, by providing a configuration in which the user can change, from the video receiving terminal, the thresholds that determine the operation of the present invention, such as the threshold of the average motion amount and the threshold of the difference between motion vectors, a video that matches the user's subjective preference can be presented to the user, and the practical effect is great.
[0087]
(Embodiment 3)
FIG. 13 is a diagram illustrating an example of a selected area correspondence management table of the video encoding transmission device according to the third embodiment of the present invention. A “child screen area deformation flag” is added to the selected area correspondence management table of the first embodiment. The “child screen area deformation flag” is a flag for determining, when the area size of the image quality reduction area differs from the area size of the child screen area superimposed on it, whether the child screen area is deformed to the same area size as the image quality reduction area. If the flag is ON, the child screen area is deformed to the same size; if it is OFF, it is not deformed. The effect of this flag is described with reference to the following figure. FIG. 14 is a diagram illustrating an example of a transmission video of the video encoding transmission device according to the third embodiment of the present invention. FIG. 14 shows a state where child screen areas 1402-1 and 1402-2 are superimposed on the main area 1401. For the child screen area 1402-1, the “child screen area deformation flag” is OFF, and the area is displayed with an area size different from that of the image quality reduction area 1403-1. For the child screen area 1402-2, on the other hand, the “child screen area deformation flag” is ON, and the area is displayed with the same area size as the image quality reduction area 1403-2.
[0088]
The actual deformation processing is performed by the auxiliary area processing unit 107 in accordance with the request from the code amount control unit 106, and the area after the deformation is encoded by the auxiliary area encoding unit 108.
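The effect of the deformation flag can be illustrated with a short sketch. The `Size` type and both function names are hypothetical, and the unused-pixel count assumes the displayed child screen fits inside the image quality reduction area; the document does not describe how the deformation itself is computed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Size:
    width: int
    height: int


def child_screen_display_size(child, reduction_area, deform_flag):
    """Size at which the child screen area is encoded and superimposed.

    Flag ON: deformed to the size of the image quality reduction area.
    Flag OFF: the child screen area keeps its own size.
    """
    return reduction_area if deform_flag else child


def unused_pixels(displayed, reduction_area):
    """Pixels of the reduction area left uncovered by the child screen
    (assumes the displayed child screen lies inside the reduction area)."""
    covered = displayed.width * displayed.height
    total = reduction_area.width * reduction_area.height
    return max(0, total - covered)
```

With a 120×90 child screen and a 160×120 reduction area, the OFF setting leaves part of the reduction area uncovered, while the ON setting covers it fully. Note that deforming to a different size may change the aspect ratio of the child screen.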
[0089]
As described above, in the present embodiment, a “child screen area deformation flag” is added to the selected area correspondence management table of the first embodiment. This flag determines, when the area size of the image quality reduction area differs from that of the superimposed child screen area, whether the child screen area is deformed to the same area size as the image quality reduction area. As a result, the child screen area can be displayed without wasting the display area of the image quality reduction area, and the practical effect is great.
[0090]
[Effect of the invention]
As described above, according to the present invention, a video captured by a high-resolution wide-angle camera is divided, according to fluctuations in the transmission band of the network or the magnitude of motion in the video, into a main area serving as the basic part of the coded transmission and auxiliary areas, which are the areas other than the main area. An insignificant area (image quality reduction area) is selected from the main area and the code amount allocated to it is reduced, and an important area (child screen area) selected from the auxiliary areas is superimposed on the insignificant area in the encoded main area. As a result, efficient video transmission is possible while improving the browsability of the entire wide-angle camera video, without encoding and transmitting the entire wide-angle camera video.
[Brief description of the drawings]
FIG. 1 is a diagram illustrating an example of a configuration of a video encoding and transmitting apparatus according to a first embodiment of the present invention.
FIG. 2 is a diagram illustrating an example of a usage form of the video encoding transmission device according to the first embodiment of the present invention.
FIG. 3 is a diagram illustrating an example of a wide-angle video area to be processed by the video encoding transmission device according to the first embodiment of the present invention.
FIG. 4 is a diagram showing an example of a transmission video of the video encoding transmission device according to the first embodiment of the present invention.
FIG. 5 is a flowchart illustrating an example of an operation of a video cutout unit of the video encoding transmission device according to the first embodiment of the present invention.
FIG. 6 is a view for explaining motion detection on a block-by-block basis in the video encoding and transmitting apparatus according to the first embodiment of the present invention.
FIG. 7 is a flowchart illustrating an example of an operation of a code amount control unit of the video encoding transmission device according to the first embodiment of the present invention.
FIG. 8 is a flowchart illustrating an example of an operation of a main area processing unit of the video encoding and transmitting apparatus according to the first embodiment of the present invention.
FIG. 9 is a flowchart illustrating an example of an operation of an auxiliary area processing unit of the video encoding and transmitting apparatus according to the first embodiment of the present invention.
FIG. 10 is a diagram illustrating an example of a selection area management table of the video encoding and transmitting apparatus according to the first embodiment of the present invention.
FIG. 11 is a diagram showing an example of a selected area correspondence management table of the video encoding and transmitting apparatus according to the first embodiment of the present invention.
FIG. 12 is a diagram illustrating an example of a configuration of a video encoding transmission device according to a second embodiment of the present invention.
FIG. 13 is a diagram illustrating an example of a selected area correspondence management table of the video encoding transmission apparatus according to the third embodiment of the present invention.
FIG. 14 is a diagram illustrating an example of a transmission video of the video encoding transmission device according to the third embodiment of the present invention.
FIG. 15 is a diagram illustrating a video transmitted by a conventional video processing device.
FIG. 16 is a diagram showing a screen displayed on the videophone device.
[Explanation of symbols]
101 Video input unit
102 Video storage unit
103 Video clipping unit
104 Main area processing unit
105 Main area encoding unit
106 Code amount control unit
107 Auxiliary area processing unit
108 Auxiliary area encoding unit
109 Video multiplexing unit
110 Video transmission unit
111 Operation parameter storage unit

Claims

A video encoding transmission device that encodes and transmits a wide-angle camera video as an input video, comprising:
a video input unit that acquires a wide-angle video;
a video storage unit that temporarily stores the wide-angle video;
a video clipping unit that cuts out a first area and a plurality of second areas from the wide-angle video stored in the video storage unit;
a main area processing unit that extracts a non-important area from the first area;
a main area encoding unit that performs encoding processing of the first area;
an auxiliary area processing unit that extracts an important area from the second areas;
an auxiliary area encoding unit that performs encoding processing of the important area;
a code amount control unit that determines the important area to be superimposed on the non-important area, and allocates code amounts to the first area and the important area;
a video multiplexing unit that multiplexes the video data encoded by the main area encoding unit and the auxiliary area encoding unit; and
a video transmission unit that transmits the video data multiplexed by the video multiplexing unit to a network.

The video encoding transmission apparatus according to claim 1, wherein the video clipping unit determines the first area so as to include many first specific areas.

The video encoding transmission apparatus according to claim 2, wherein the first specific area is an area including a large number of video areas having a predetermined size or less and having a large motion.

The video encoding transmission apparatus according to claim 2, wherein the first specific area is an area including a large number of video areas having a predetermined size or less and including many human faces.

The video encoding transmission apparatus according to claim 1, wherein the main area processing unit selects a second specific area from the first area and sets the second specific area as the non-important area.

The video encoding transmission apparatus according to claim 5, wherein the second specific area is an area having a small motion.

The video encoding transmission apparatus according to claim 5, wherein the second specific area is an area that does not include a person's face.

The video encoding transmission apparatus according to claim 1, wherein the auxiliary area processing unit selects a third specific area from the second area and sets the selected area as an important area.

The video encoding transmission apparatus according to claim 8, wherein the third specific area is an area having a large motion.

The video encoding transmission apparatus according to claim 8, wherein the third specific area is an area including a person's face.

The video encoding transmission device according to claim 1, further comprising a transmission information receiving unit that receives video data transmitted from a video receiving terminal.

The video encoding transmission apparatus according to claim 11, wherein the video clipping unit determines the first specific area using the transmission information received by the transmission information receiving unit.

The video encoding transmission apparatus according to claim 11, wherein the main area processing unit determines the second specific area using the transmission information received by the transmission information receiving unit.

The video encoding transmission apparatus according to claim 11, wherein the auxiliary area processing unit determines the third specific area using the transmission information received by the transmission information receiving unit.

A coding amount adjusting method for controlling the coding amount when a server encodes and transmits a video, comprising:
dividing the acquired video into a first area and one or more second areas;
detecting a non-important area from within the first area and an important area from the second areas; and
in accordance with a target code amount, lowering the coding amount of the non-important area in the first area and superimposing the important area on the non-important area, thereby adjusting the coding amount of the video to be transmitted.