JP4010270B2

JP4010270B2 - Image coding and transmission device

Info

Publication number: JP4010270B2
Application number: JP2003098534A
Authority: JP
Inventors: 智坂爪
Original assignee: Victor Company of Japan Ltd
Current assignee: Victor Company of Japan Ltd
Priority date: 2003-04-01
Filing date: 2003-04-01
Publication date: 2007-11-21
Anticipated expiration: 2023-04-01
Also published as: JP2004312059A

Description

【０００１】
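【Technical Field of the Invention】
The present invention relates to an image coding and transmission device, and more particularly to an image coding and transmission device using arbitrary reference frames, in which a coded bitstream, obtained by encoding an image signal with an arbitrary frame used as the reference frame, is transmitted over a storage medium or a network transmission path.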
【０００２】
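【Prior Art】
In recent years, video distribution systems that let users access the moving images they want to watch over computer networks have become widespread. Such a system interconnects multiple networks, and multiple servers that distribute video information are connected to each network. Multiple video receiving and playback terminal devices (hereinafter simply "terminal devices"), to which the video information is distributed, are also connected to each network.
【０００３】
In this video distribution system, when a terminal device is to receive video information from a server, the terminal device first contacts the server and establishes a connection between the two. The terminal device then receives the desired video information provided by the server and plays it back.
【０００４】
The video information transmitted in such a system is, in most cases, multimedia information created by combining digitized audio, images, and other data. When the data is simply digitized, the amount of information is generally 1 to 2 bytes per character for text, 64 kbps for telephone-quality audio, and 100 Mbps or more for moving images at the quality of current terrestrial television broadcasting. Handling such enormous amounts of digital information as-is is not realistic given the cost of using today's transmission paths and recording media.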
【０００５】
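For this reason, coding techniques for compressing such enormous amounts of information are being actively researched and developed. A representative video coding technique is MPEG (Moving Picture Experts Group), an international standard for coding moving image data. The MPEG standards currently defined are MPEG-1, mainly for storage media of about 1.5 Mbps such as Video CD; MPEG-2, which supports diverse communication and broadcasting applications such as digital broadcasting; MPEG-4, which enables more advanced multimedia content through object coding and its control; and MPEG-7, for content description.
【０００６】
Applying such coding techniques makes it possible to encode video data efficiently and to compress the amount of information to roughly 1/20 to 1/40. Other coding schemes that introduce new coding methods also exist, such as H.264, which is currently being standardized, and JPEG2000 (Joint Photographic Experts Group 2000); adopting them allows even more effective coding.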
【０００７】
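MPEG, meanwhile, is designed to achieve more effective compression by combining several techniques. Fig. 8 is a block diagram of an example of a conventional MPEG encoder. This encoder encodes image information by MPEG: it takes the difference between the input code 301 and the image signal reconstructed by the motion-compensated predictor 305, thereby reducing the redundancy that the input code 301 contains along the time axis.
【０００８】
MPEG has three prediction modes: intra-frame (Intra) coding, forward predictive (Predictive) coding, and bidirectional (Bidirectional) predictive coding. Intra-frame coding uses only the information within the frame, without the output of the motion-compensated predictor 305. A frame coded in this mode is called an I picture or I-VOP (Intra Video Object Plane). An I-VOP can be decoded without depending on any other frame.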
【０００９】
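In forward predictive coding, the motion-compensated predictor 305 compensates a previously coded frame to predict the current frame, and the orthogonal transformer 303 transforms the resulting difference frame. MPEG uses the DCT (Discrete Cosine Transform) as this orthogonal transform, although other transform bases such as the Hadamard basis or a wavelet basis can also be used. A frame coded in this mode is called a P picture or P-VOP. Because a P-VOP depends on another frame, it cannot be decoded independently like an I-VOP, but it achieves a higher compression ratio than an I-VOP.
【００１０】
In bidirectional predictive coding, the motion-compensated predictor 305 compensates coded frames from both the past and the future to predict the current frame, and the orthogonal transformer 303 transforms the resulting difference frame. A frame coded in this mode is called a B picture or B-VOP. Because a B-VOP depends on other frames in both the past and the future, it cannot be decoded independently, but it achieves a still higher compression ratio than I-VOPs and P-VOPs.
【００１１】
In each VOP, prediction is performed per MB (macroblock) of 16 × 16 pixels. The available prediction directions differ among I-VOPs, P-VOPs, and B-VOPs. In an I-VOP, every MB is coded independently. A P-VOP has two modes: coding by prediction from a past frame, and coding the MB independently without prediction. A B-VOP has four modes: prediction from the future, prediction from the past, prediction from both, and coding the MB independently without prediction.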
【００１２】
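The motion estimator 304 performs pattern matching on the moving regions of the input code for each MB and detects motion vectors with 0.5-pixel accuracy. The motion-compensated predictor 305 uses the motion vector detected by the motion estimator 304 to shift the reference by the amount of motion before predicting. A motion vector has a horizontal and a vertical component and is transmitted as side information of the MB together with the MC (Motion Compensation) mode, which indicates where the prediction comes from.
【００１３】
One frame of the input code 301 read from the frame memory 302 and the motion-compensated output of the motion-compensated predictor 305 are subtracted in the subtractor 314, and the resulting difference image signal is supplied to the orthogonal transformer 303, where it is orthogonally transformed. MPEG uses the DCT for this transform. The DCT is an orthogonal transform obtained by discretizing, over a finite space, an integral transform whose basis is the cosine function. In MPEG, a two-dimensional DCT is applied to each 8 × 8 DCT block obtained by dividing an MB into four. Since video signals generally contain many low-frequency components and few high-frequency components, the DCT concentrates the coefficients in the low-frequency components, which lets the subsequent quantizer 306 reduce the amount of information efficiently.
【００１４】
The DCT coefficients obtained by the orthogonal transformer 303 are quantized by the quantizer 306. The quantization value is the product of a quantization matrix, whose 8 × 8 entries weight the two-dimensional frequencies according to visual characteristics, and a quantization scale, which multiplies the whole matrix by a scalar; each DCT coefficient is divided by its quantization value. At inverse quantization in the decoder, multiplying by the quantization value yields a value close to the original DCT coefficient.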
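The divide-then-multiply relationship described above can be illustrated with a small sketch. It is deliberately simplified and is an assumption-laden illustration, not the MPEG-specified procedure: the function names, the 2 × 2 block size (MPEG uses 8 × 8), the plain rounding rule, and all numeric values are hypothetical.

```python
def quantize(coeffs, matrix, scale):
    """Divide each coefficient by (matrix entry * quantization scale) and round."""
    return [[round(c / (m * scale)) for c, m in zip(c_row, m_row)]
            for c_row, m_row in zip(coeffs, matrix)]


def dequantize(levels, matrix, scale):
    """Multiply each quantized level back by the same quantization value."""
    return [[lv * m * scale for lv, m in zip(l_row, m_row)]
            for l_row, m_row in zip(levels, matrix)]


# Toy 2 x 2 block for brevity; values are made up for illustration only.
coeffs = [[620.0, -31.0], [12.0, 3.0]]
matrix = [[8, 16], [16, 32]]   # larger weights for higher frequencies
scale = 2                      # quantization scale (hypothetical value)

levels = quantize(coeffs, matrix, scale)
restored = dequantize(levels, matrix, scale)
```

In this toy case the small high-frequency coefficients quantize to zero, and the restored values are close to, but not identical with, the originals, which is the lossy step the paragraph describes.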
【００１５】
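The quantized data is coded by the entropy coder 310. The entropy coder 310 is generally a VLC (Variable Length Code) unit that applies variable-length coding to the quantized data. Of the quantized values, the direct-current (DC) component is coded with DPCM (Differential Pulse Code Modulation), a form of predictive coding. The alternating-current (AC) components are zigzag-scanned from low to high frequency; each run of zeros together with the following nonzero coefficient value is treated as one event, and Huffman coding assigns shorter codes to events with higher probability of occurrence.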
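As a rough illustration of the zigzag scan and the (zero run, coefficient) events described above, the following sketch may help. The function names are hypothetical, a 4 × 4 block is used only to keep the example short, and the Huffman code assignment itself is not shown.

```python
def zigzag_order(n=8):
    """Return (row, col) positions of an n x n block in zigzag order."""
    def key(rc):
        r, c = rc
        s = r + c                      # anti-diagonal index
        # Odd diagonals run top-right to bottom-left, even ones the other way.
        return (s, r if s % 2 else c)
    return sorted(((r, c) for r in range(n) for c in range(n)), key=key)


def run_level_events(block):
    """Turn the AC coefficients into (zero_run, value) events."""
    order = zigzag_order(len(block))
    events, run = [], 0
    for r, c in order[1:]:             # skip the DC coefficient at (0, 0)
        value = block[r][c]
        if value == 0:
            run += 1
        else:
            events.append((run, value))
            run = 0
    return events                      # trailing zeros are left to an end-of-block code


block = [
    [50, 3, 0, 0],
    [-2, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
events = run_level_events(block)
```

Each event would then be mapped to a variable-length code, with frequent events receiving the shortest codes.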
【００１６】
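The entropy coder 310 codes not only the quantized data but also, under predetermined conditions, the auxiliary motion vector information output from the motion estimator 304 and the motion-compensated predictor 305. The entropy-coded data is stored in the temporary buffer memory 312 via the multiplexer 311 and output at a predetermined transfer rate as coded data (output code) 317. The code amount of the output code 317 for each macroblock is reported to the code amount controller 313, which in turn reports to the quantizer 306 the error between the reported code amount and the target code amount. The quantizer 306 controls the code amount by adjusting the quantization scale according to this error.
【００１７】
The quantized data output from the quantizer 306 is inverse-quantized by the inverse quantizer 307 and then inverse-transformed by the inverse orthogonal transformer 308 using the transform basis corresponding to the orthogonal transformer 303. The inverse quantizer 307 and the inverse orthogonal transformer 308 form the local decoding unit 315. The signal output from the inverse orthogonal transformer 308 is added in the adder 316 to the frame from the motion-compensated predictor 305 to form a decoded image, which is temporarily stored in the reference prediction memory 309. The stored frame is used by the motion-compensated predictor 305 as the reference frame (reference decoded image) for computing the difference image.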
【００１８】
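The pictures from one I picture up to the picture just before the next I picture are called a GOP (Group Of Pictures). Likewise, the VOPs from one I-VOP up to the VOP just before the next I-VOP are called a GOV (Group Of VOPs). For storage media and the like, about 15 pictures are generally used as one GOP or GOV. Fig. 9 shows the structure used to form a GOV.
【００１９】
As shown in the figure, the GOV start code is 32 bits, the time code indicating the time from the head of the sequence is 18 bits, closed_gov, indicating whether the images in the GOV can be played back independently of other GOVs, is 1 bit, and broken_link, indicating whether the preceding GOV data is unusable because of editing, is 1 bit.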
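The field widths listed above (32 + 18 + 1 + 1 = 52 bits) can be illustrated by packing them into a single integer. This is only a sketch: the function name is hypothetical, the time code is treated as one opaque 18-bit value rather than being split into its sub-fields, and the start code value 0x000001B3 is taken here to be the MPEG-4 Visual group_of_vop start code, which the text above does not itself state.

```python
GOV_START_CODE = 0x000001B3  # assumed MPEG-4 Visual group_of_vop_start_code


def pack_gov_header(time_code, closed_gov, broken_link):
    """Pack the four GOV header fields into one 52-bit integer."""
    if not 0 <= time_code < (1 << 18):
        raise ValueError("time_code must fit in 18 bits")
    return ((GOV_START_CODE << 20)      # 32-bit start code
            | (time_code << 2)          # 18-bit time code
            | (int(closed_gov) << 1)    # 1-bit closed_gov flag
            | int(broken_link))         # 1-bit broken_link flag


header = pack_gov_header(time_code=5, closed_gov=True, broken_link=False)
header_bits = format(header, "052b")    # bit string, most significant bit first
```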
【００２０】
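The coded bitstream output from an image coding and transmission device configured as in Fig. 8 is distributed via a storage medium or network transmission and played back by a terminal device. For playback, the distributed bitstream is first decoded by the MPEG decoder of the terminal device.
【００２１】
Fig. 10 is a block diagram of an example of a conventional MPEG decoder. The coded bitstream distributed to the terminal device enters the MPEG decoder as the input code 401 and is supplied to the demultiplexer 402. The demultiplexer 402 separates the input code 401 into its entropy-coded parts, such as motion vector information and texture information, and supplies each to the entropy decoder 403.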
【００２２】
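The entropy decoder 403 decodes the entropy-coded texture information and supplies it to the inverse quantizer 404, and decodes the entropy-coded motion vector information and supplies it to the motion-compensated predictor 407. The entropy decoder 403 is generally an IVLC (Inverse Variable Length Code) unit that applies variable-length decoding to the demultiplexed data. The inverse quantizer 404 multiplies the entropy-decoded texture information by the quantization value to obtain values close to the original orthogonal transform coefficients and supplies the result to the inverse orthogonal transformer 405.
【００２３】
Working macroblock by macroblock and taking the position of the macroblock currently being decoded as the origin, the motion-compensated predictor 407 applies the motion vector information in reverse to locate, in the reference frame stored in the reference prediction memory 406, the region corresponding to that macroblock. It then shifts the located region by the motion vector obtained from the entropy decoder 403 and places it at the position of the current macroblock, thereby creating the predicted frame. The inverse orthogonal transformer 405 inverse-transforms the orthogonal transform coefficients input from the inverse quantizer 404 to obtain the original frame information or the inter-frame difference information. MPEG uses the IDCT for this inverse transform.
【００２４】
The frame obtained by the IDCT is added in the adder 408 to the predicted frame from the motion-compensated predictor 407, then output as the output code 409 and also stored in the reference prediction memory 406. In this way the input code 401 is decoded by the MPEG decoder of the terminal device and played back as video information.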
【００２５】
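【Problems to Be Solved by the Invention】
Conventionally, when video information is transmitted over a transmission path, an image coding and transmission device built on the general MPEG configuration and adapted for network transmission encodes the video information in units of GOPs or GOVs, in order to avoid problems such as synchronization on the image receiving and decoding device side. The coded bitstream obtained by encoding in GOV units is normally transmitted over the transmission path, in order from the head of each GOV, to an image receiving and decoding device that is likewise built on the general MPEG configuration and adapted for network transmission.
【００２６】
Fig. 11 is a block diagram of an example of a conventional image coding and transmission device, and Fig. 12 is a block diagram of an example of a conventional image receiving and decoding device. As shown in Fig. 11, the conventional image coding and transmission device consists of an encoding unit 501 and a coded transmission unit 502. To transmit data efficiently over the transmission path, the coded transmission unit 502 generally divides each GOV into smaller packets for transmission.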
【００２７】
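The encoding unit 501 consists of a buffer 504 that temporarily stores the image signal from the input means 503, a multi-frame storage memory 506 that stores at least one frame of the input image signal, GOV management means 505, an encoder 510 that encodes the image signal output from the multi-frame storage memory 506 by MPEG, a buffer 509 that accumulates coded frames, a decoder 511 that decodes coded frames, and a frame reference prediction memory 512 that generates reference frames and supplies them to the encoder 510.
【００２８】
To transmit with the GOV as the coding unit, the GOV management means 505 monitors the storage state of the multi-frame storage memory 506 and the GOV storage buffer memory 513 and controls the switches SW21 and SW22 as needed, so that coded output is delivered to the coded transmission unit 502 in GOV units.
【００２９】
The GOV storage buffer memory 513 of the coded transmission unit 502 temporarily stores the coded frames input from the encoder 510 in the encoding unit 501 via the buffer 509 and the switch SW22, and then outputs them to the transmission packet sending means 514. The transmission packet sending means 514 divides them into at least one packet, the unit of network transmission, and sends the packets to the network or storage means 515 while monitoring the state of the network transmission bandwidth.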
【００３０】
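As shown in Fig. 12, the conventional image receiving and decoding device consists of a decoding-side receiving unit 601 that receives at least one packet sent to the network or storage means 515 (603) by the coded transmission unit 502 of Fig. 11, a decoding unit 602 that decodes the received data, and a display frame memory 614 that stores one frame of the image signal from the decoding unit 602 and supplies it to the display device 615 for display.
【００３１】
In the decoding-side receiving unit 601, input packets are received by the transmission packet receiving means 604, the bitstream is analyzed by the coded bitstream analysis means 605, and the resulting coded frame information is input to the GOV storage buffer memory 608 via the switch SW31, which is controlled by the frame storage order control means 606. The GOV, which is the unit of decoding, is thereby reconstructed.
【００３２】
The GOV management means 609 in the decoding unit 602 monitors the storage state of the GOV storage buffer memory 608 and controls the switch SW32 as needed, sending coded data in GOV units to the decoder 612 via the buffer 611. The decoder 612 decodes GOV by GOV using reference frames from the frame reference prediction memory 613 and supplies the resulting playback video signal to the display frame memory 614. One frame of the playback video signal stored in the display frame memory 614 is displayed by the display device 615.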
【００３３】
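When the coded bitstream is transmitted over a transmission path from the image coding and transmission device of Fig. 11 to the image receiving and decoding device of Fig. 12 and the transmission rate fluctuates sharply, packet loss may make it impossible to reconstruct the information in a GOV completely, which affects decoding of that GOV.
【００３４】
In addition, when a sudden drop in the transmission rate means the bit rate of the GOV currently being transmitted cannot be sustained, packets are delayed or retransmitted, reconstruction of the GOV takes longer, and transmission of the GOV may be delayed as a result.
【００３５】
When a GOV is not delivered adequately in this way, the image receiving and decoding device discards the incomplete GOV that missed its playback time, and playback stops for one GOV.
【００３６】
Alternatively, when the playback time of a still-incomplete GOV arrives, the image receiving and decoding device can keep the incomplete GOV, decode as far as it can, and continue display for as long as possible. Even then, the VOPs in the latter part of the GOV that were not transmitted by the playback time cannot be decoded, so playback stops until the next GOV.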
【００３７】
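A conventional way to avoid such GOV loss and delay is to monitor continuously the transmission rate of the path between the image coding and transmission device and the image receiving and decoding device and, when a sudden drop is detected, to request the image coding and transmission device to lower its coding bit rate. The image coding and transmission device then lowers the spatial resolution, frame rate, and image quality according to the requested bit rate, so that a lower-bit-rate coded bitstream is generated and transmitted by the coded transmission unit 502.
【００３８】
However, to issue bit rate change instructions while monitoring the transmission bit rate in this way, the decoding-side receiving unit of the image receiving and decoding device needs a fairly large GOV receive buffer from which to detect the transmission bit rate. Because the transmission bit rate is detected by monitoring the fill level of this buffer, a certain time lag arises before a change in the transmission bit rate is detected. During this lag the device keeps sending a high-bit-rate coded bitstream into a path whose transmission bit rate has dropped, so the transmission delay grows.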
【００３９】
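Consider next the case where, in the coded transmission unit 502 of the image coding and transmission device shown in Fig. 11, the coded bitstream output from the encoder 510 in the encoding unit 501 via the buffer 509 and the switch SW22 is stored in the GOV storage buffer memory 513, after which the transmission packet sending means 514 packetizes the GOV and sends it to the transmission path, namely the network or storage means 515.
【００４０】
Suppose that, while the transmission packet sending means 514 is packetizing the stored GOV and sending it to the transmission path, the image coding and transmission device detects that the transmission bit rate of the path has dropped and tries to change the coding bit rate of the GOV partway through sending.
【００４１】
In an ordinary image coding and transmission device, once a GOV has been stored in the GOV storage buffer memory 513 of the coded transmission unit 502, its coding bit rate can no longer be modified, so control of the coding bit rate lags by at least one GOV.
【００４２】
A more sophisticated image coding and transmission device would analyze the GOV that is partway through sending and identify the VOPs not yet sent to the transmission path. It would then delete that GOV from the GOV storage buffer memory 513 and request the encoding unit 501 to re-encode from the identified VOP. If the VOPs for the remaining frames can be obtained from the identified VOP onward, the GOV transmission delay is small and the drop in playback quality on the image receiving and decoding device side is also kept small.
【００４３】
With such a configuration of the image coding and transmission device and the image receiving and decoding device, however, real-time distribution of live video and the like requires a large amount of frame storage buffer memory on the image coding and transmission device side to make re-encoding possible. It also requires complex control of the encoding unit 501.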
【００４４】
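The present invention has been made in view of the above. Its object is to provide an image coding and transmission device that reduces the effect on GOV decoding and achieves higher-quality image transmission than before when a coded bitstream is transmitted over a transmission path to an image receiving and decoding device and the transmission rate fluctuates sharply. Even where, with the conventional method, packet loss makes it difficult to reconstruct the information in a GOV completely, the invention takes into account both the coding method and the transmission order of the coded frames within the GOV, so that the receiving side can use whatever part of the GOV has arrived by decode time: the frame rate on the decoding side changes automatically with the number of frames of the GOV accumulated by decode time, and frames are played back at intervals that are as even as possible.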
【００４５】
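【Means for Solving the Problems】
To achieve the above object, the present invention is an image coding and transmission device that encodes an input image signal using reference frames and sends the resulting coded frames according to a predetermined frame sending order, the device comprising: a first frame storage memory that stores at least one frame of the input image signal; a second frame storage memory that stores a predetermined number of reference frames used in encoding; first switch means provided on the output side of the first frame storage memory; second switch means provided on the input side and output side of the second frame storage memory; reference frame control means that, according to the predetermined frame sending order, controls the first and second switch means to control the transfer of the input image signals and reference frames stored in the first and second frame storage memories; an encoder that encodes the input image signal of the frame number selected by the first switch means from among the input image signals stored in the first frame storage memory, using one reference frame selected by the second switch means from among the reference frames stored in the second frame storage memory, to obtain a coded frame; a local decoder that decodes the coded frame output from the encoder, generates a new reference frame using the one reference frame, among those stored in the second frame storage memory, that the encoder used in encoding, and stores the new reference frame in the second frame storage memory via the second switch means; input/output frame management means that monitors and manages the storage state of the first frame storage memory, monitors the coded frames output from the encoder and manages the coded output, and reports these states to the reference frame control means; a third frame storage memory that stores a group of coded frames generated by the encoder; and output means that packetizes each frame in the group of coded frames stored in the third frame storage memory and sends it according to the predetermined frame sending order.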
【００４６】
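In this invention, the encoder obtains a coded frame by encoding the input image signal of the frame number selected by the first switch means, using one reference frame selected by the second switch means from those stored in the second frame storage memory. The local decoder decodes that coded frame, generates a new reference frame using the same reference frame the encoder used, and stores the new reference frame in the second frame storage memory via the second switch means. As a result, when encoding, the encoder can use any frame stored in the second frame storage memory as the reference frame.
【００４７】
The output means comprises coded bitstream analysis means for identifying each frame in the group of coded frames stored in the third frame storage memory, and frame sending control means for controlling the order in which those frames are sent to the transmission path according to the predetermined frame sending order.
【００４８】
The coded bitstream analysis means identifies each coded frame contained in the GOV stored in the third frame storage memory. Using the position information of each coded frame identified by the coded bitstream analysis means, the frame sending control means can rearrange the coded frames of the stored GOV into the predetermined frame sending order used when the GOV is sent to the transmission path.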
【００４９】
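Further, to achieve the above object, in the present invention the reference frame control means controls the first and second switch means using, as the predetermined frame sending order, the sending frame numbers in the order in which they are determined by computing means. Let M be the number of coded frames in a group of coded frames, and let the frame numbers of the input image signals stored in the first frame storage memory be the integers from 0 to M−1. The computing means comprises: initialization means that takes frame number 0 as the first sending frame number and initializes variable A to M and count value C to 1; first calculation means that, after the initialization, computes B by the expression
B = [(A + 1) / 2] ([ ] is the Gauss bracket, i.e. the floor function)
and sets variable n to 0; second calculation means that, using this B and n, computes D by the expression
D = B + 2 × B × n (n is a non-negative integer)
; first determination means that determines whether the computed D is less than M; sending frame number determination means that, when the first determination means finds D < M, adopts that value of D as a sending frame number, increments both n and C by 1, and has the second calculation means compute D again; second determination means that, when the first determination means finds D to be M or more, determines whether C is greater than M; and setting means that, when the second determination means finds C to be M or less, sets A to the current value of B and then has the first calculation means compute B. The above computation is repeated until the second determination means finds C to be greater than M.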
【００５０】
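In this invention, the predetermined frame sending order is the order in which the computing means determines the sending frame numbers, and it is used by both the reference frame control means and the output means. Consequently, even when the image receiving and decoding device decodes coded frames in the order received, coding and transmission can be arranged so that the frame rate within a GOV rises each time another coded frame is decoded.
【００５１】
The present invention is further characterized in that the reference frame control means manages the input and output of frames from the first and second frame storage memories to the encoder and the local decoder so that, based on the predetermined sending frame order, encoding uses at least one reference frame at a frame position that will already have been transmitted before the frame currently being encoded. With this management, even when the image receiving and decoding device decodes coded frames in the order received, the coded output can be controlled so that the frame rate within a GOV rises each time another coded frame is decoded.
【００５２】
The present invention is further characterized in that the reference frame control means, in response to a reference frame switching request from the encoder, manages the input and output of frames from the first and second frame storage memories to the encoder and the local decoder so that encoding is performed again after switching to a different reference frame or combination of reference frames. The new choice is taken from the reference frames at frame positions that, according to the sending frame order obtained by the computing means, will already have been transmitted before the frame currently being encoded, and is one not yet used as a reference frame or combination of reference frames for that frame.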
【００５３】
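The present invention is further characterized in that the encoder comprises means for monitoring the coding bit rate during encoding and means for issuing a reference frame switching request to the reference frame control means so that re-encoding is performed according to a predetermined coding output condition. The predetermined coding output condition is thereby satisfied.
【００５４】
The present invention is further characterized in that the predetermined coding output condition is to output the coded frame at the point where the coding bit rate is lowest among the reference frames at frame positions that, according to the sending frame order obtained by the computing means of claim 2, will already have been transmitted before the frame currently being encoded. With this invention, the optimum coded output can be obtained when encoding each frame in the GOV according to the sending frame order.
【００５５】
The present invention is further characterized in that the contents of the third frame storage memory, which can store at least one group of coded frames generated by the encoder, are discarded when the next group of coded frames has been generated by the encoder and its transmission time arrives, and that next group is stored promptly. With this invention, the coding bit rate is controlled without constant monitoring of the transmission bit rate of the path, which reduces both the growth in transmission delay caused by the time lag in detecting changes in the transmission bit rate and the effect on GOV decoding.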
【００５６】
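【Embodiments of the Invention】
An embodiment of the present invention is described next with reference to the drawings. Fig. 1 is a block diagram of an embodiment of the image coding and transmission device using arbitrary reference frames according to the invention. The device of this embodiment consists mainly of an encoding unit 101 that can encode an input image using at least one reference frame, a coded transmission unit 102 that can efficiently transmit to the transmission path a GOV made up of a group of at least one coded frame, input means 103 that lets the encoding unit 101 acquire the images to be coded and transmitted, and a network or storage means 122 that can carry the coded frames to the image receiving and decoding device.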
【００５７】
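In this embodiment the encoding unit 101 consists of a buffer 104, switches SW1 to SW5, a first multi-frame storage memory 106, input/output frame management means 107, a buffer 109, reference frame control means 110, an encoder 111, a second multi-frame storage memory 113, and a local decoder 116. The encoding unit 101 corresponds to the encoding unit 501 of Fig. 11 with the frame reference prediction memory 512 extended into the second multi-frame storage memory 113 and, accordingly, the GOV management means 505 extended into the reference frame control means 110 and the input/output frame management means 107.
【００５８】
The local decoder 116 of Fig. 1 corresponds to the local decoding unit 315 of Fig. 8, the second multi-frame storage memory 113 of Fig. 1 corresponds to the reference prediction memory 309 of Fig. 8, and the encoder 111 corresponds to the remaining parts of Fig. 8.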
【００５９】
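The components of the encoding unit 101 are described below. The buffer 104 has enough capacity to hold several image frames from the input means 103 temporarily, and is used for adjustment when one GOV's worth of frames is stored in the first multi-frame storage memory 106.
【００６０】
The switch SW1 manages the connection between the buffer 104 and the first multi-frame storage memory 106; it is connected, disconnected, and switched by the input/output frame management means 107. So that the encoder 111 can encode in GOV units, the first multi-frame storage memory 106 must be able to store at least the number of input frames needed to form a GOV.
【００６１】
For example, when the input frame rate is 30 fps, one GOV is 0.5 seconds, and one GOV contains 15 coded frames, the first multi-frame storage memory 106 reserves a memory area that can store at least 15 frames of input images.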
【００６２】
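To control encoding in GOV units, the input/output frame management means 107 manages the storing of input frames from the buffer 104 into the first multi-frame storage memory 106 by connecting, disconnecting, and switching the switch SW1. It likewise manages the output signal from the buffer 109 by connecting, disconnecting, and switching the switch SW5. It also manages the timing at which GOV-unit encoding starts, by notifying the reference frame control means 110 of a request to start encoding a GOV.
【００６３】
That is, the input/output frame management means 107 monitors whether the first multi-frame storage memory 106 holds the input frames needed to encode a GOV. Confirming that enough input frames are stored makes it possible to encode the frames of a GOV in any order within that GOV.
【００６４】
Monitoring the coded frames output from the encoder 111 also makes it possible to control the storage state of the third multi-frame storage memory 117. Furthermore, reporting the storage status of the first multi-frame storage memory 106 and the third multi-frame storage memory 117 to the reference frame control means 110 enables the reference frame control means 110 to control the start timing of GOV-unit encoding and the at least one reference frame.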
【００６５】
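The switch SW5 manages the output timing of the GOV temporarily held in the buffer 109; the timing of its connection, disconnection, and switching is managed by the input/output frame management means 107. The buffer 109 temporarily holds the coded frames of a GOV output from the encoder 111 so that they can be output to the coded transmission unit in GOV units.
【００６６】
The switch SW2 manages the connection between the first multi-frame storage memory 106 and the encoder 111, and is connected, disconnected, and switched under control of the reference frame control means 110. The second multi-frame storage memory 113 must have frame memory for at least the number of frames needed to form a GOV, so that each reference frame generated by the local decoder 116 can be stored at its corresponding frame position within the GOV. For example, when the input frame rate is 30 fps, one GOV is 0.5 seconds, and one GOV contains 15 coded frames, the second multi-frame storage memory 113 reserves capacity for at least 15 reference frames.
【００６７】
The switch SW3 manages the connection between the local decoder 116 and the second multi-frame storage memory 113, and is connected, disconnected, and switched under control of the reference frame control means 110. The switch SW4 manages the connections between the second multi-frame storage memory 113 and the encoder 111 and between the second multi-frame storage memory 113 and the local decoder 116, and is likewise controlled by the reference frame control means 110.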
【００６８】
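The reference frame control means 110 identifies, from the predetermined frame sending order, the input frame that should be fed to the encoder 111, and connects, disconnects, and switches the switch SW2 so that the correct input frame reaches the encoder 111. By managing the switch SW3 in the same way, it ensures that each reference frame generated by the local decoder 116 is stored at the correct reference frame position in the second multi-frame storage memory 113.
【００６９】
By managing the switch SW4, the reference frame control means 110 further ensures that, for the position of the input frame currently being encoded under the predetermined frame sending order, reference frames that have already been sent under that order can be used for prediction.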
【００７０】
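The local decoder 116 is configured like, for example, the local decoding unit 315 contained in the MPEG encoder shown in Fig. 8. The coded frame output from the encoder 111 to the local decoder 116 corresponds to the quantized frame at the output of the quantizer on the input side of the entropy coder in Fig. 8, so the local decoder 116 includes at least an inverse quantizer and an inverse orthogonal transformer.
【００７１】
On receiving a coded frame from the encoder 111, the local decoder 116 inverse-quantizes it and then applies the inverse orthogonal transform to obtain the decoded frame of the frame just encoded. It outputs this decoded frame as a reference frame to the second multi-frame storage memory 113 via the switch SW3, ready for encoding of the next input frame.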
【００７２】
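When encoding an I-VOP, the encoder 111, like the MPEG encoder shown in Fig. 8, applies a predetermined orthogonal transform to the input frame obtained from the first multi-frame storage memory 106. In MPEG this transform is the DCT. Note, however, that the transform is not particularly limited: as long as the encoder 111 and local decoder 116 of the image coding and transmission device of this embodiment and the decoder 216 of the image receiving and decoding device all use the same orthogonal transform, it may be another transform such as the Hadamard transform or a wavelet transform.
【００７３】
The quantizer in the encoder 111, configured as shown in Fig. 8, then reduces the amount of information. For a P-VOP or B-VOP, the encoder uses the input frame obtained from the first multi-frame storage memory 106 and a reference frame obtained from the second multi-frame storage memory 113, and applies the predetermined orthogonal transform to the difference between the input frame and the motion-compensated reference frame. The quantizer in the encoder 111 then reduces the amount of information.
【００７４】
The encoder 111 outputs the coded data at this point to the local decoder 116. It then entropy-codes the quantized data and, using its internal multiplexer, multiplexes it with side information such as vector information that has likewise been entropy-coded. The encoder 111 outputs the coded frame produced in this way to the buffer 109.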
【００７５】
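When the code amount must be minimized, the code amount controller monitors the buffer memory in the encoder 111; encoding is performed at least once with at least one reference frame, the code amount of each result is monitored, and the result with the smallest code amount is taken as the coded frame output. For this purpose the code amount controller in the encoder 111 can send a reference frame change request to the reference frame control means 110.
【００７６】
On receiving this request, the reference frame control means 110 switches to a different reference frame chosen from those that, under the predetermined frame sending order, were coded and output before the input frame currently being encoded. The encoder 111 thereby obtains a new reference frame consistent with the predetermined frame sending order.
【００７７】
Repeating this operation allows the code amounts obtained with several reference frames to be compared, so the encoder 111 can identify the reference frame giving the smallest code amount and output the coded frame with that smallest code amount to the buffer 109.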
【００７８】
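For example, suppose 15 frames are encoded as one GOV according to the predetermined frame sending order shown in Fig. 2, and that encoding has progressed up to position 3 in the transmission order. In terms of frame numbers, frames 0, 8, 4, and 12 have already been encoded and are stored as reference frames in the second multi-frame storage memory 113 of Fig. 1.
【００７９】
In ordinary encoding, only the single reference frame produced by the most recent encoding is kept, and reference frames from earlier encodings cannot be used. In this embodiment, by controlling the switch SW4, any reference frame generated so far, here frames 0, 8, 4, and 12 stored in the second multi-frame storage memory 113, can be used as the reference frame.
【００８０】
Therefore the reference frame for encoding the next target, frame number 2 (position 4 in the transmission order), can be selected from the four frames 0, 8, 4, and 12. Encoding is performed with each of these four frames, the output code amount is obtained for each, and the amounts are compared to identify the reference frame that yields the smallest code amount.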
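The trial-and-compare selection described above can be sketched as follows. The function names are hypothetical, and the `encoded_size` callback stands in for an actual encoding pass; the byte counts in the table are made-up placeholders used only to make the example runnable, not measured results.

```python
def pick_min_cost_reference(target, candidates, encoded_size):
    """Encode `target` against each candidate and keep the cheapest reference.

    encoded_size(target, ref) is assumed to return the code amount obtained
    when `target` is encoded with `ref` as its reference frame.
    """
    costs = {ref: encoded_size(target, ref) for ref in candidates}
    best = min(costs, key=costs.get)
    return best, costs


# Placeholder code amounts for frame 2 against each stored reference frame.
FAKE_SIZES = {(2, 0): 5200, (2, 8): 7400, (2, 4): 4900, (2, 12): 8100}


def fake_encoded_size(target, ref):
    return FAKE_SIZES[(target, ref)]


best_ref, costs = pick_min_cost_reference(2, [0, 8, 4, 12], fake_encoded_size)
```

With these placeholder numbers frame 4 would be chosen; with real content the winner depends on the actual code amounts.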
【００８１】
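The coded transmission unit 102 is described next. In this embodiment it consists of a third multi-frame storage memory 117, coded bitstream analysis means 118, frame sending order control means 119, a switch SW6, and transmission packet sending means 121. Each component is described below.
【００８２】
The third multi-frame storage memory 117 has memory that can acquire and store a GOV, via the switch SW5, from the buffer 109, which temporarily accumulates one GOV's worth of coded frames generated by the encoder 111. Its capacity is calculated and reserved in advance from the maximum bit rate of the encoding unit 101 and the maximum bit rate of the transmission path, that is, the network or storage means 122.
【００８３】
The coded bitstream analysis means 118 analyzes the GOV header of the GOV stored in the third multi-frame storage memory 117 and the header information of each coded frame, and collects, for each coded frame, the information needed to locate it within the GOV when it is transmitted over the transmission path in the predetermined frame sending order.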
【００８４】
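The coded bitstream analysis means 118 also manages the timing at which the GOV stored in the third multi-frame storage memory 117 is transmitted, by issuing a coded transmission start request to the frame sending order control means 119. It passes the collected position information for the coded frames in the GOV to the frame sending order control means 119, and it has means for receiving notice from the frame sending order control means 119 that sending of the coded frames in the GOV is complete.
【００８５】
In addition, the coded bitstream analysis means 118 acquires coded frames from the third multi-frame storage memory 117 via the switch SW6 and outputs them to the transmission packet sending means 121. Where information such as the time code (time_code) in the GOV header, or the modulo time base (modulo_time_base) and VOP time increment (vop_time_increment) in the coded frame header, needs correction, the coded bitstream analysis means 118 preferably has means for correcting this header information.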
【００８６】
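The frame sending order control means 119 manages the sending order of the coded frames of the GOV stored in the third multi-frame storage memory 117 by connecting, disconnecting, and switching the switch SW6 according to the position information reported by the coded bitstream analysis means 118 and the predetermined frame sending order. Here, the coded frames generated by the encoder 111 enter the third multi-frame storage memory 117 in the order in which they were generated, and the switch SW6 is controlled so that they leave the memory in that same order.
【００８７】
When, on switching the switch SW6, the frame sending order control means 119 detects that all coded frames of the GOV have been sent, it notifies the coded bitstream analysis means 118 that sending of the coded frames in the GOV is complete.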
【００８８】
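The switch SW6 manages the connection between the third multi-frame storage memory 117 and the coded bitstream analysis means 118, and is connected, disconnected, and switched under control of the frame sending order control means 119. The transmission packet sending means 121 acquires coded frames from the coded bitstream analysis means 118, breaks each one into at least one packet, and transmits the packets to the transmission path, that is, the network or storage means 122.
【００８９】
With the transmission packet sending means 121, the coded frame of the GOV that the frame sending order control means 119 has selected for transmission can be packetized into a packet size that the transmission path carries efficiently. The configuration and component functions described above realize one embodiment of the image coding and transmission device of the invention.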
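The step of breaking one coded frame into at least one packet can be sketched as below. This is a minimal illustration under assumptions: the function name and the fixed `max_payload` size are hypothetical, and real packet headers, sequence numbers, and any adaptation of packet size to the transmission path are omitted.

```python
def packetize(frame: bytes, max_payload: int) -> list[bytes]:
    """Split one coded frame into consecutive payloads of at most max_payload bytes."""
    if max_payload <= 0:
        raise ValueError("max_payload must be positive")
    if not frame:
        raise ValueError("a coded frame must contain at least one byte")
    return [frame[i:i + max_payload] for i in range(0, len(frame), max_payload)]


packets = packetize(b"abcdefgh", max_payload=3)
reassembled = b"".join(packets)
```

Concatenating the payloads in order restores the original frame, which is what the receiving side relies on when it reconstructs coded frames from packets.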
【００９０】
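Because the predetermined frame sending order is central to the operation of the image coding and transmission device of the invention and of the image receiving and decoding device, this order and the method of selecting reference frames during encoding are described next.
【００９１】
First, the method of determining the predetermined frame sending order is described with reference to Figs. 2 and 3. Fig. 2 shows the order in which the coded frames of a GOV are transmitted and which reference frame each coded frame is encoded with. Fig. 3 is a flowchart of the method of obtaining the predetermined frame sending order in the invention. The procedure of Fig. 3 may be carried out externally in advance, with the resulting sending frame order held in the reference frame control means 110 for use, or the reference frame control means 110 may carry it out during the initialization performed when the whole coding and transmission device starts up.
【００９２】
Let M be the number of coded frames in a group of coded frames, which in this embodiment is a GOV. The stored frame numbers can then be expressed as the integers from 0 to M−1. Frame number 0 is taken as the first sending frame number. The following description assumes M = 15 and a GOV in which the first frame is an I-VOP and all the others are P-VOPs.
【００９３】
First, A = M and C = 1 are set (step S101). A is a variable used during the computation, and C is a counter of how many entries of the frame sending order have been determined. Because frame number 0 has already been fixed as the first frame to send, one entry of the order is already known, which is why C starts at 1. Since M = 15 here, initialization gives A = 15 and C = 1.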
【００９４】
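Next, the variable B is computed by
B = [(A + 1) / 2]
(step S102), where [ ] denotes the Gauss bracket (floor). Then, after the variable n is initialized to 0 (step S103), the variable D is computed by
D = B + 2 × B × n
(step S104).
【００９５】
With n taking the values 0, 1, 2, ..., the D obtained in step S104 forms an arithmetic sequence with first term B and common difference 2 × B. Measured from the position of frame 0, the frame numbers for the next entries of the transmission order are therefore obtained at a spacing of B.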
【００９６】
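Next, it is determined whether D < M (step S105). If D < M, the frame position obtained in step S104 lies within the frame positions that exist in the GOV. In that case the process proceeds to step S106, where the D obtained in step S104 is adopted as the next entry of the transmission order and saved as the next sending frame number.
【００９７】
The process then proceeds to step S107, where C = C + 1 and n = n + 1 are computed. The counter C is incremented because a new entry of the transmission order has been determined, and n is incremented to obtain the next term of the sequence. After step S107 the process returns to step S104 to compute the next term. If, on the other hand, step S105 finds that D < M does not hold, the frame position obtained in step S104 does not lie within the frame positions that exist in the GOV.
【００９８】
Let us check what has actually happened so far. The transmission order starts with frame 0. The next entry is the first term B, and since B = 8, frame 8 is adopted as the next entry. The sequence then increases by the common difference 2 × B, so recomputing D gives D = 8 + 2 × 8 × 1 = 24. This exceeds the range allowed by M = 15 and cannot exist in the GOV, so it is determined whether C > M (step S108).
【００９９】
If C > M, all frames in the GOV have been adopted into the transmission order, and the process of obtaining the order ends. If step S108 finds that C > M does not hold, the process proceeds to step S109, where A = B, that is, A is set to the value of B. The process then returns to step S102 and continues.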
【０１００】
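The frame transmission order of the invention is obtained by the above process. For reference, the actual processing from step S102 is followed a little further. At step S108 the order determined so far is frame 0, frame 8, and the state is A = 15, B = 8, D = 24, C = 2, n = 1.
【０１０１】
Since C > M does not hold at this point, the process proceeds to step S109, where A is updated to the value of B, giving A = 8. The process then goes to step S102 and recomputes B = [(A + 1) / 2]; with A = 8 this gives B = [4.5] = 4. Step S103 resets n = 0, and step S104 computes the transmission order with the new B = 4. D then takes the values 4 (n = 0), 12 (n = 1), and 20 (n = 2) as steps S104 to S107 repeat. At n = 2, D < M no longer holds, so step S109 again updates A to the value of B (now 4) and the above process repeats. The transmission order determined so far is therefore frame 0 → frame 8 → frame 4 → frame 12. The rest of the order is obtained as shown under "transmission order" in Fig. 2(C), and the process is then complete.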
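The procedure of steps S101 to S109 can be sketched in code as follows. This is an illustrative reading of the flowchart, not a definitive implementation: the function name is hypothetical, and the termination test is written as C ≥ M rather than the C > M of the text, on the assumption that C starts at 1 with frame 0 already counted; with a literal C > M and M = 15 the loop would make one more pass with B = 1 and emit the odd frame numbers a second time. The sketch has only been checked against the M = 15 example used in the description and should not be assumed to give a duplicate-free order for every GOV size.

```python
def frame_send_order(m):
    """Return the frame numbers of one GOV in sending order (sketch of Fig. 3)."""
    order = [0]                 # frame 0 is always sent first
    a, c = m, 1                 # step S101: A = M, C = 1
    while True:
        b = (a + 1) // 2        # step S102: B = floor((A + 1) / 2)
        n = 0                   # step S103
        while True:
            d = b + 2 * b * n   # step S104
            if d >= m:          # step S105: D is outside the GOV
                break
            order.append(d)     # step S106: adopt D as next sending frame
            c += 1              # step S107
            n += 1
        if c >= m:              # step S108 (assumption: >= instead of >)
            return order
        a = b                   # step S109


order_15 = frame_send_order(15)
```

For M = 15 the first entries reproduce the walk-through above (0, 8, 4, 12), and the list ends with frame 13.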
【０１０２】
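In Fig. 2(C) and (D), the values for "transmission order" and "reference frame number" are drawn on four horizontal lines. This is only for convenience of illustration; time runs from left to right and from top to bottom. What Fig. 2(C) shows is the position in the transmission order, not the frame number. First the frame at the position marked 0, which is frame number 0 in Fig. 2(B), is transmitted; next the frame at the position marked 1 in Fig. 2(C), frame number 8 in Fig. 2(B); then the frame at the position marked 2, frame number 4; then the frame at the position marked 3, frame number 12; and so on in the order frame number 2, frame number 6, frame number 10, and so forth, until finally the frame at the position marked 14 in Fig. 2(C), frame number 13 in Fig. 2(B), is transmitted.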
【０１０３】
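The method of selecting reference frames during encoding under the frame transmission order of the invention is described next. Fig. 2(A) shows the frame types schematically in boxes: the first coded frame in the GOV is an I-VOP, marked I, and the second and subsequent coded frames are P-VOPs, marked P. The GOV covers 0.5 seconds and contains 15 coded frames. The coded frames are transmitted in the predetermined frame sending order of this embodiment, as shown in Fig. 2(C).
【０１０４】
The first frame encoded in the GOV is frame 0, the head frame of the GOV, which is encoded as an I-VOP. As the "transmission order" of Fig. 2(C) shows, the frame with frame number 8 (frame 8) is the next to be encoded.
【０１０５】
In the conventional approach, frame 8 would simply be one of the input frames arriving at some sampling interval. Since frame 8 must be encoded as a P-VOP, the reference frame normally used is that of the frame encoded immediately before it.
【０１０６】
That is, in the conventional method, if the frame currently being encoded is frame 8 and the frame encoded immediately before it is frame 0, the sampling interval for encoding is 8. Subsequent frames must then also be encoded at a sampling interval of 8, and the frame rate in this case drops to 2 fps.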
【０１０７】
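In this embodiment, by contrast, the reference frame of any frame encoded before the current one can be used, so frame 8 is encoded with reference to frame 0, as shown in Fig. 2(D). Frame 4, the third frame transmitted, could refer to frame 0 or frame 8; here it refers to frame 0, as shown in Fig. 2(D). Frame 12, the fourth frame transmitted, could refer to frame 0, 8, or 4; here it refers to frame 8, the one closest to frame 12, as shown in Fig. 2(D).
【０１０８】
In this way, as Fig. 2(D) shows, this embodiment follows the transmission order and can use, from among the reference frames at already-encoded positions in the GOV, the one that is closest in time.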
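Choosing the temporally closest already-sent frame can be sketched as follows. The function name is hypothetical, and the tie-break (preferring the lower frame number when two candidates are equally close) is an assumption chosen to match the stated example in which frame 4 refers to frame 0 rather than frame 8. The hard-coded order is the 15-frame sequence that the Fig. 3 procedure yields for M = 15, of which the text lists the first seven entries and the last.

```python
SEND_ORDER = [0, 8, 4, 12, 2, 6, 10, 14, 1, 3, 5, 7, 9, 11, 13]


def nearest_reference(frame, already_sent):
    """Pick the already-sent frame closest in time; ties go to the lower number."""
    return min(already_sent, key=lambda ref: (abs(frame - ref), ref))


# Build a frame -> reference table by walking the sending order.
references = {}
for position, frame in enumerate(SEND_ORDER):
    if position == 0:
        references[frame] = None    # frame 0 is an I-VOP and needs no reference
    else:
        references[frame] = nearest_reference(frame, SEND_ORDER[:position])
```

The entries for frames 8, 4, and 12 agree with the references named in the text; the remaining entries depend on the assumed tie-break and are not confirmed by it.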
【０１０９】
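Moreover, because the invention encodes according to the predetermined frame sending order, the frame interval during encoding does not stay fixed as in the conventional method described above; encoding proceeds so that the frame interval gradually shortens. As a result, with the predetermined frame sending order and reference frame selection method of the invention, the frame rate on the decoding side rises in proportion to the number of frames that could be decoded.
【０１１０】
Adopting this way of determining the frame transmission order produces a new effect when the image coding and transmission device encodes so that coded frames can be transmitted over the transmission path in this order and decoded by the image receiving and decoding device in the order transmitted.
【０１１１】
In the GOV structure of Fig. 2, playback frames are at first decoded at a time interval of 8 frames, about half of the M = 15 frames in the GOV. The interval between decoded frames then halves to 8/2 = 4 frames, and eventually reaches 1 frame, which is the original frame rate.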
【０１１２】
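Now suppose the transmission bit rate of the path drops sharply and only half of the coded frames in a GOV are transmitted. As above, one GOV covers 0.5 seconds and contains M = 15 frames, and the decoding side receives only 8 of the 15 frames.
【０１１３】
In the conventional method the frame rate stays at 30 fps, so the 8 received frames, about 0.25 seconds of images, are displayed as smooth motion, but for the remaining 0.25 seconds or so there are no frames to display and the moving image stops for about 0.25 seconds. Whatever happened in the picture during the second 0.25 seconds is lost completely. If this occurs while video is being distributed by, for example, VOD (Video On Demand) or a surveillance camera, the chance of missing important information in the video rises, with a significant impact on video quality.
【０１１４】
In this embodiment, by contrast, even if only the first 8 transmitted frames of the GOV are received, encoding based on the frame transmission order of this embodiment causes the frame rate to drop automatically at decode time, and the coded frames of the GOV are arranged so that playback within the GOV is as evenly spaced as possible. Where only 0.25 seconds of motion could otherwise be shown, here the lower frame rate makes the display less smooth, and something that happened between frames may be missed, but complete loss of the video content is avoided.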
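The effect of receiving only part of a GOV can be checked with a short sketch. The function name is hypothetical, the hard-coded order is the 15-frame sequence that the Fig. 3 procedure yields for M = 15, and the sketch assumes that every received frame is decodable, that is, that its reference frame was sent earlier in the same order.

```python
SEND_ORDER = [0, 8, 4, 12, 2, 6, 10, 14, 1, 3, 5, 7, 9, 11, 13]


def displayable_frames(send_order, received_count):
    """Frame numbers available for display, in display order."""
    return sorted(send_order[:received_count])


def frame_gaps(frames):
    """Spacing, in source frames, between consecutive displayed frames."""
    return [b - a for a, b in zip(frames, frames[1:])]


half_gov = displayable_frames(SEND_ORDER, 8)    # only 8 of 15 frames arrived
gaps = frame_gaps(half_gov)

# Conventional sequential sending for comparison: the first 8 frames in display order.
sequential = displayable_frames(list(range(15)), 8)
```

With 8 frames received, the proposed order gives every second frame across the whole GOV, whereas sequential sending gives frames 0 to 7 followed by nothing for the rest of the GOV.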
【０１１５】
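This effect is even more pronounced when coded frames are transmitted in an environment where the transmission bit rate of the path fluctuates frequently. With the conventional method, frequent rate fluctuation causes the video to be played back in fragments, making it extremely hard to follow the content as a whole. With this embodiment, the frame rate changes dynamically with the frames received, so the video plays back without much hindrance to following the overall content, and extreme swings in frame rate are also avoided. The method is therefore quite effective in environments with frequently fluctuating transmission rates.
【０１１６】
A further, separate effect also arises. Consider the case where the coded frames carried over the transmission path come from a bitstream that was encoded in advance. For a bitstream encoded by the conventional method, the frame rate at decode time is fixed at encoding time, so some form of transcoding is needed if the frame rate is to be controlled as required.
【０１１７】
With this embodiment, a bitstream once created can be decoded GOV by GOV and the decoding stopped at any point, and the GOV is then played back at a frame rate corresponding to the number of frames decoded so far. For the same reason, even when the decoding capability of the image receiving and decoding device is low, playback is possible at a frame rate that matches the number of frames it can decode.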
【０１１８】
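The operation of the image coding and transmission device of the invention is described next with reference to the block diagram of Fig. 1 and the flowcharts of Figs. 4, 5, and 6. Fig. 4 is a flowchart of the operation of the encoding unit 101 in Fig. 1.
【０１１９】
The operation of the encoding unit 101 in Fig. 1 is as follows. When the input means 103 starts image input, the encoding unit 101 stores the input frames acquired from the input means 103 in the buffer 104 (step S201 in Fig. 4). It is assumed that the number of frames needed to form a GOV is already stored in the buffer 104 at this point.
【０１２０】
In step S202 the input/output frame management means 107 determines whether the buffer 104 holds enough frames to form a GOV. If it does, the process proceeds to step S203. If it does not, the process proceeds to terminal ③ and the processing of the encoding unit 101 ends.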
【０１２１】
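In step S203 the input/output frame management means 107 connects the switch SW1. Once the switch SW1 is connected, the frames needed to form a GOV can be transferred from the buffer 104 to the first multi-frame storage memory 106, which stores that number of input frames from the buffer 104 (step S204 in Fig. 4).
【０１２２】
When the first multi-frame storage memory 106 holds enough frames to form a GOV, the input/output frame management means 107 disconnects the switches SW1 and SW5 (step S205 in Fig. 4) and then sends an encoding start request to the reference frame control means 110 (step S206 in Fig. 4). The reference frame control means 110 then sets the switches SW2 and SW3 to the position from which the head frame of the GOV can be obtained (step S207 in Fig. 4) and disconnects the switch SW4 (step S208 in Fig. 4).
【０１２３】
Next, the reference frame control means 110 sends an encoding start request to the encoder 111 (step S209 in Fig. 4). The encoder 111 acquires an input frame from the first multi-frame storage memory 106 via the switch SW2 (step S210 in Fig. 4), encodes it (step S211 in Fig. 4), and outputs the coded frame to the buffer 109 (step S212 in Fig. 4). The local decoder 116 also acquires the coded frame from the encoder 111 (step S213 in Fig. 4).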
【０１２４】
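The local decoder 116 decodes the acquired coded frame to generate a reference frame (step S214 in Fig. 4) and stores it in the second multi-frame storage memory 113 via the switch SW3 (step S215 in Fig. 4). The reference frame control means 110 then determines, from the current coded frame position under the predetermined frame sending order, whether all coded frames of the GOV are complete (step S216 in Fig. 4). If they are, the process proceeds to step S226, which corresponds to terminal ①. If not, it proceeds to step S217.
【０１２５】
If not all coded frames are complete, the reference frame control means 110 switches the switches SW2, SW3, and SW4, according to the predetermined frame sending order, to the positions from which the next frame of the GOV can be obtained (step S217 in Fig. 4). It then sends an encoding start request to the encoder 111 (step S218 in Fig. 4).
【０１２６】
In response, the encoder 111 acquires an input frame from the first multi-frame storage memory 106 via the switch SW2 and a reference frame from the second multi-frame storage memory 113 via the switch SW4 (step S219 in Fig. 4). It then encodes using the input frame and the reference frame (step S220 in Fig. 4) and outputs the resulting coded frame to the buffer 109 (step S221 in Fig. 4).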
【０１２７】
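Next, the local decoder 116 acquires the coded frame from the encoder 111 and the reference frame from the second multi-frame storage memory 113 via the switch SW4 (step S222 in Fig. 4). It decodes using the coded frame and the reference frame to generate a new reference frame (step S223 in Fig. 4) and stores it in the second multi-frame storage memory 113 via the switch SW3 (step S224 in Fig. 4).
【０１２８】
The reference frame control means 110 then determines, from the current coded frame position under the predetermined frame sending order, whether all coded frames of the GOV are complete (step S225 in Fig. 4). If they are, the process proceeds to step S226. If not, it returns to step S217 and repeats steps S217 to S224 described above.
【０１２９】
When the reference frame control means 110 determines in step S216 or step S225 that all coded frames of the GOV are complete, it sends a GOV encoding stop request to the input/output frame management means 107 and the encoder 111 (step S226 in Fig. 4) and disconnects the switch SW2 (step S227 in Fig. 4). The input/output frame management means 107, for its part, connects the switch SW5 (step S228 in Fig. 4). All coded frames of the GOV are thereby stored from the buffer 109, via the switch SW5, in the third multi-frame storage memory 117 in the coded transmission unit 102 (step S229 in Fig. 4). The process then proceeds to step S202, which corresponds to terminal ②.
【０１３０】
The image coding and transmission device of this embodiment operates through the above processing, which makes it possible to solve the various problems of the conventional method.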
【０１３１】
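The operation of the coded transmission unit 102 of Fig. 1 is described next with reference to the flowcharts of Figs. 5 and 6. First, it is determined whether a GOV can be acquired from the buffer 109 in the encoding unit 101 (step S301 in Fig. 5); if so, the third multi-frame storage memory 117 acquires the GOV from the buffer 109 (step S302 in Fig. 5). The coded bitstream analysis means 118 then acquires the header information of the GOV from the third multi-frame storage memory 117 (step S303 in Fig. 5), analyzes it (step S304 in Fig. 5), and issues a coded transmission start request to the frame sending order control means 119 (step S305 in Fig. 5).
【０１３２】
The frame sending order control means 119 then switches the switch SW6 according to the predetermined frame sending order (step S306 in Fig. 5) so that the coded frames stored in the third multi-frame storage memory 117 are supplied to the coded bitstream analysis means 118 in that order (step S307 in Fig. 5). The coded bitstream analysis means 118 adds or corrects the necessary header information for each input coded frame (step S308 in Fig. 5) and outputs the coded frame with its header information to the transmission packet sending means 121 (step S309 in Fig. 5). Steps S305 to S309 are repeated until all coded frames of the GOV have been processed (step S310 in Fig. 5).
【０１３３】
Next, the transmission packet sending means 121 determines whether a coded frame can be acquired from the coded bitstream analysis means 118 (step S401 in Fig. 6) and, if so, acquires it (step S402 in Fig. 6). It packetizes the acquired coded frame (step S403 in Fig. 6) and outputs the packetized data to the transmission path, that is, the network or storage means 122 (step S404 in Fig. 6). Finally, the transmission packet sending means 121 ends processing once all the packetized data has been output to the network or storage means 122 (steps S405 and S404 in Fig. 6).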
【０１３４】
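An example of an image receiving and decoding device that receives and decodes the coded bitstream sent over the transmission path from the image coding and transmission device of the invention is described next. Fig. 7 is a block diagram of this example. As shown, the image receiving and decoding device consists of a decoding-side receiving unit 201 that receives at least one packet from the image coding and transmission device of Fig. 1 via the network or storage means 203 (122), reconstructs coded frames from the packets, and reconstructs the GOV; a decoding unit 202 that decodes the reconstructed GOV and can decode using at least one reference frame; a display frame memory 219 that stores one frame of the image signal from the decoding unit 202; and a display device 220 that displays the image signal from the display frame memory 219.
【０１３５】
In Fig. 7, the decoding-side receiving unit 201, the switch SW12, and the buffer 211 are an extension of the demultiplexer 402 shown in Fig. 10; the second multi-frame storage memory 213 is an extension of the reference prediction memory 406 shown in Fig. 10; and the decoder 216, the reference frame control means 212, the switches SW13 to SW15, and the input/output frame management means 209 are an extension of the remaining parts of Fig. 10.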
【０１３６】
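The configuration and operation of the decoding-side receiving unit 201 are described next. The decoding-side receiving unit 201 consists of transmission packet receiving means 204, coded bitstream analysis means 205, frame storage order control means 206, a switch SW11, and a first multi-frame storage memory 208. At least one packet input from the network or storage means 203 is received by the transmission packet receiving means 204, which reconstructs the coded frame and supplies it to the coded bitstream analysis means 205.
【０１３７】
The coded bitstream analysis means 205 analyzes the input header information and, in order to reconstruct the GOV, requests the frame storage order control means 206 to control switching of the switch SW11. The frame storage order control means 206 then switches the switch SW11 according to the predetermined frame storage order, so that the coded frames from the coded bitstream analysis means 205 are supplied to and stored in the first multi-frame storage memory 208 together with the header information.
【０１３８】
The first multi-frame storage memory 208 stores the coded frames reconstructed by the transmission packet receiving means 204 and the coded bitstream analysis means 205, and has capacity for at least one GOV's worth of coded frames. This capacity is calculated and reserved in advance from the maximum bit rate of the encoding unit and the maximum bit rate of the transmission path.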
【０１３９】
次に、復号化部２０２の構成及び動作について説明する。復号化部２０２は、入出力フレーム管理手段２０９、スイッチＳＷ１２〜ＳＷ１５、緩衝バッファ２１１、参照フレーム制御手段２１２、第２複数フレーム格納用メモリ２１３、デコード部２１６、緩衝バッファ２１８から構成されている。スイッチＳＷ１３及びスイッチＳＷ１５は、何れも参照フレーム制御手段２１２により、第２複数フレーム格納用メモリ２１３に格納されている複数フレームの画像信号の中から一のフレームの画像信号を選択する点で同じであるが、スイッチＳＷ１３は選択したフレームの画像信号をデコード部２１６へ出力し、スイッチＳＷ１５は選択したフレームの画像信号を緩衝バッファ２１８へ出力する点で異なる。
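[0140]
To control decoding in units of GOVs, the input/output frame management means 209 manages the storage of encoded frames from the first multi-frame storage memory 208 into the buffer 211 by connecting, disconnecting, and switching the switch SW12. It also manages the timing at which decoding of each GOV starts, by notifying the reference frame control means 212 of a request to start decoding in units of GOVs.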
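[0141]
Accordingly, once all the encoded frames of a GOV have been stored in the first multi-frame storage memory 208, the switch SW12 in the decoding unit 202 is connected under the control of the input/output frame management means 209, and the encoded frames of the GOV stored in the first multi-frame storage memory 208 are temporarily stored in the buffer 211, one GOV at a time.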
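[0142]
The input/output frame management means 209 then notifies the reference frame control means 212 of a decoding start request. The reference frame control means 212 thereupon sets the switches SW13 and SW14 to the positions at which the first decoded frame of the GOV can be obtained, and disconnects the switch SW12.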
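[0143]
The reference frame control means 212 then notifies the decode unit 216 of a decoding start request. The decode unit 216 must include at least, for example, the entropy decoder 403, the inverse quantizer 404, the inverse orthogonal transformer 405, the adder 408, and the motion compensation predictor 407 of the MPEG decoding apparatus shown in FIG. 10. The reference prediction memory 406 shown in FIG. 10 is implemented by the second multi-frame storage memory 213 in FIG. 7.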
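[0144]
On obtaining an encoded frame from the buffer 211, the decode unit 216 performs entropy decoding with its internal entropy decoder, inverse quantization with its inverse quantizer, and then inverse orthogonal transformation. From that result and the output of the motion compensation predictor, which receives the reference frame image signal from the second multi-frame storage memory 213 via the switch SW13, it obtains the decoded frame of the encoded frame currently being processed.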
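[0145]
The reference frame control means 212 controls switching of the switch SW14 so that the reference frame decoded by the decode unit 216 is stored, via the switch SW14, at the correct corresponding reference frame position in the second multi-frame storage memory 213, in preparation for decoding the next encoded frame.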
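[0146]
The second multi-frame storage memory 213 has frame memories capable of holding at least the number of frames needed to make up a GOV, so that each reference frame generated by the decode unit 216 can be stored at its corresponding frame position in the GOV. For example, when the frame rate of the input image is 30 fps, one GOV spans 0.5 seconds, and one GOV contains 15 encoded frames, the second multi-frame storage memory 213 reserves enough capacity to store at least 15 reference frames.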
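The sizing rule in this paragraph is simple arithmetic, sketched below. The frame count follows the 30 fps and 0.5 second figures given in the text; the byte estimate additionally assumes a hypothetical 352 x 288 picture stored as 8-bit YUV 4:2:0 (1.5 bytes per pixel), which the patent does not state.

```python
def frames_per_gov(frame_rate_fps: float, gov_seconds: float) -> int:
    """Number of frames that make up one GOV."""
    return round(frame_rate_fps * gov_seconds)


def reference_memory_bytes(width: int, height: int, frames: int) -> int:
    """Bytes needed to hold `frames` decoded frames.

    Assumes 8-bit YUV 4:2:0 storage (1.5 bytes per pixel); this pixel
    format is an illustrative assumption, not something the text specifies.
    """
    return width * height * 3 // 2 * frames


n_frames = frames_per_gov(30, 0.5)                 # 15 frames, as in the text
capacity = reference_memory_bytes(352, 288, n_frames)
```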
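[0147]
Next, the reference frame control means 212 determines, from the current decoded-frame position in the predetermined frame sending order, whether all decoded frames of the GOV are present in the second multi-frame storage memory 213. If they are not, it switches the switch SW14, according to the predetermined frame sending order, to the position at which the next decoded frame of the GOV can be obtained.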
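[0148]
These operations are repeated. When the reference frame control means 212 determines that all encoded frames in the GOV have been decoded by the decode unit 216 and are present in the second multi-frame storage memory 213, it notifies the decode unit 216 of a GOV decoding stop request and disconnects the switch SW14. The input/output frame management means 209, for its part, connects the switch SW15.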
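[0149]
As a result, all the image signals in the GOV (the decoded reference frames) are supplied from the second multi-frame storage memory 213 via the switch SW15 to the buffer 218 and temporarily stored there, then supplied to and temporarily stored in the display frame memory 219, and finally displayed by the display device 220.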
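[0150]
The present invention is not limited to the embodiment described above. For example, in FIG. 1, once one GOV of encoded frames generated by the encode unit 111 has been stored in the buffer 109, the encoded frames may be stored in the third multi-frame storage memory 117 in their original frame number order, that is, frame numbers 0, 1, 2, ..., 14, and the frame sending order control means 119 may control switching of the switch SW6 so that the encoded frames of the corresponding frame numbers are read from the third multi-frame storage memory 117 in the order in which the encode unit 111 encoded them (the transmission order shown in FIG. 2(C)). In this case, however, a sending frame order determining means that determines the sending frame order according to the flowchart of FIG. 3 must be provided in advance, for example in the encoding unit 101, so that both the reference frame control means 110 and the frame sending order control means 119 can refer to the sending frame order it determines.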
[0151]
The first multi-frame storage memory 106 may also have a capacity sufficient to store only a single image frame.
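[0152]
Furthermore, by taking into account the encoding method and transmission order of the encoded frames in a GOV, the number of encoded frames transmitted may be limited, even when the transmission bit rate of the transmission path drops, by discarding the remainder of a generated GOV that could not be transmitted within a fixed time. The frame rate on the image receiving and decoding apparatus side then changes automatically according to the number of encoded frames contained in the received GOV, and the frames can be reproduced and displayed at intervals as nearly equal as possible. In this way the encoding bit rate may be controlled without constantly monitoring the transmission bit rate of the transmission path.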
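[0153]
Likewise, when the remainder of a generated GOV that could not be transmitted within a fixed time is discarded in this way, so that the frame rate on the image receiving and decoding apparatus side changes automatically according to the number of encoded frames contained in the received GOV and the frames are reproduced and displayed at intervals as nearly equal as possible, two problems of constant bit rate monitoring are reduced: the growth in transmission delay caused by the fixed time lag before a change in the transmission bit rate is detected, and the effect on GOV decoding. Image transmission of higher quality than before can therefore be performed.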
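The discard rule described in these two paragraphs can be illustrated with a small simulation. It is a sketch under assumptions: the function name, the per-frame transmission times, and the sending order used in the usage lines are all hypothetical, and the order shown is not asserted to be the one in FIG. 2(C).

```python
from typing import List, Sequence


def frames_delivered(send_order: Sequence[int],
                     tx_time_ms: Sequence[int],
                     deadline_ms: int) -> List[int]:
    """Return the frame numbers fully sent before the deadline, in display order.

    Frames are sent in `send_order`; as soon as the next frame would not
    finish within `deadline_ms`, it and all remaining frames are discarded.
    """
    elapsed = 0
    delivered = []
    for frame_no, cost in zip(send_order, tx_time_ms):
        if elapsed + cost > deadline_ms:
            break                      # discard the rest of this GOV
        elapsed += cost
        delivered.append(frame_no)
    return sorted(delivered)


# Hypothetical coarse-to-fine order: widely spaced frames are sent first.
order = [0, 8, 4, 12, 2, 6]
shown = frames_delivered(order, [100] * len(order), deadline_ms=450)
```

With this illustrative order, the four frames that arrive in time (0, 4, 8, 12) are evenly spaced, so the receiver simply plays the GOV at a lower frame rate instead of stalling.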
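[0154]
[Effects of the invention]
As described above, according to the present invention, when the encode unit encodes a frame, any frame among those encoded before it, which are stored in the second frame storage memory, can be used as the reference frame, so the temporally nearest reference frame can be used. Consequently, the encoded bit stream can be sent in the predetermined sending frame order even over a transmission path whose transmission rate fluctuates so severely that packet loss makes it difficult to reconstruct the information in a GOV completely.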
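[0155]
Further, according to the present invention, the encoded bit stream can be sent to the image receiving and decoding apparatus in a predetermined sending frame order such that, using the GOV information transmitted by the decoding time, the frame rate on the decoding side changes automatically according to the number of frames of the GOV accumulated by the decoding time, and the frames are reproduced and displayed at intervals as nearly equal as possible. Therefore, even when the encoded bit stream is sent over a transmission path whose transmission rate fluctuates severely, the image content is never lost entirely, and the image receiving and decoding apparatus can still grasp the content of the images as a whole.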
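[0156]
Further, according to the present invention, because the encoded frames are transmitted in a predetermined frame sending order chosen so that frames can be reproduced and displayed at intervals as nearly equal as possible, the encoding bit rate can be controlled without constantly monitoring the transmission bit rate of the transmission path. In addition, the present invention can reduce the growth in transmission delay caused by the fixed time lag that arises, when the transmission bit rate is constantly monitored, before a change in it is detected, as well as the effect on GOV decoding. The present invention can thus realize image transmission of higher quality than before.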
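[Brief description of the drawings]
[FIG. 1] A block diagram of one embodiment of the image encoding and transmission apparatus of the present invention.
[FIG. 2] A diagram for explaining an example of the frame sending order used by the image encoding and transmission apparatus of the present invention and of how reference frames are used during encoding.
[FIG. 3] A flowchart of one embodiment of the processing for determining the frame sending order of the present invention.
[FIG. 4] A flowchart for explaining the operation of the encoding unit in the image encoding and transmission apparatus of the present invention.
[FIG. 5] A flowchart (part 1) for explaining the operation of the encoding transmission unit in the image encoding and transmission apparatus of the present invention.
[FIG. 6] A flowchart (part 2) for explaining the operation of the encoding transmission unit in the image encoding and transmission apparatus of the present invention.
[FIG. 7] A block diagram of an example of an image receiving and decoding apparatus that decodes the encoded frames output by the image encoding and transmission apparatus of the present invention.
[FIG. 8] A block diagram of an example of an encoding apparatus based on conventional MPEG technology.
[FIG. 9] A diagram representing the GOV header in MPEG-4.
[FIG. 10] A block diagram of an example of a decoding apparatus based on conventional MPEG technology.
[FIG. 11] A block diagram of an example of a conventional image encoding and transmission apparatus.
[FIG. 12] A block diagram of an example of a conventional image receiving and decoding apparatus.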
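[Description of reference signs]
101: encoding unit
102: encoding transmission unit
103: input means (input image)
104, 109, 211, 218: buffer
106, 208: first multi-frame storage memory
107, 209: input/output frame management means
110, 212: reference frame control means
111: encode unit
113, 213: second multi-frame storage memory
116: local decode unit
117: third multi-frame storage memory
118, 205: encoded bit stream analyzing means
119: frame sending order control means
121: transmission packet sending means
122, 203: network or storage means
201: decoding receiving unit
202: decoding unit
204: transmission packet receiving means
206: frame storage order control means
216: decode unit
219: display frame memory
220: display device
SW1 to SW6, SW11 to SW14: switches
[0001]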
BACKGROUND OF THE INVENTION
The present invention relates to an image encoding and transmission apparatus, and in particular to an image encoding and transmission apparatus using arbitrary reference frames, in which an encoded bit stream, obtained by encoding an image signal with an arbitrary frame used as the reference frame, is transmitted over a transmission path such as a storage medium or a network.
[0002]
[Prior art]
In recent years, video distribution systems capable of accessing a moving image to be viewed via a computer network have become widespread. This video distribution system is mutually connected to a plurality of networks, and a plurality of servers for distributing video information are connected to each network. A plurality of video reception / playback terminal devices (hereinafter simply referred to as terminal devices) to which video information is distributed are connected to each network.
[0003]
In the above video distribution system, when a terminal device is to receive video information from a server, the terminal device first contacts the server and establishes a connection between the two. The terminal device then receives the desired video information provided by the server and reproduces it.
[0004]
In such a video distribution system, the video information transmitted is usually multimedia information created by combining digitized audio, images, and other data. When such data is simply digitized, a character generally requires 1 to 2 bytes, telephone-quality audio requires 64 kbps, and video at current terrestrial television broadcast quality requires 100 Mbps or more. Handling such a large amount of digital information as it is would be impractical given the cost of today's transmission paths and recording media.
[0005]
Encoding techniques for compressing such an enormous amount of information are therefore being actively researched and developed. A representative video encoding technique is MPEG (Moving Picture Experts Group), an international standard for encoding moving image data. The MPEG standards currently defined are MPEG-1, intended mainly for storage media of about 1.5 Mbps such as Video CD; MPEG-2, which supports a wide range of communication and broadcasting applications such as digital broadcasting; MPEG-4, which enables more advanced multimedia content through object coding and its control; and MPEG-7, intended for content description.
[0006]
By applying such an encoding technique, video data containing an enormous amount of information can be encoded efficiently and compressed to roughly 1/20 to 1/40 of its original size. Other encoding schemes that introduce new encoding methods also exist, such as H.264, whose standard is currently being drafted, and JPEG2000 (Joint Photographic Experts Group 2000); adopting these allows information to be encoded even more effectively.
[0007]
MPEG, for its part, combines several techniques to achieve more effective compression. FIG. 8 is a block diagram of an example of a conventional MPEG encoding apparatus. This apparatus encodes image information by MPEG: it takes the difference between the input code 301 and the image signal decoded by the motion compensation predictor 305, thereby reducing the redundancy of the input code 301 along the time axis.
[0008]
MPEG has three prediction modes: intra-frame (Intra) coding, forward predictive (Predictive) coding, and bidirectional predictive (Bidirectional) coding. In intra-frame coding, encoding is performed using only the information within the frame, without using the output of the motion compensation predictor 305. A frame encoded in this mode is called an I picture or IVOP (Intra Video Object Plane). An IVOP can be decoded without depending on any other frame.
[0009]
In forward predictive coding, the motion compensation predictor 305 applies motion compensation to a previously encoded frame to predict the current frame, and the resulting difference frame is transformed by the orthogonal transformer 303. MPEG uses the DCT (Discrete Cosine Transform) for this orthogonal transform, although other transform bases such as Hadamard or wavelet bases can also be used. A frame encoded in this mode is called a P picture or PVOP. Because a PVOP depends on another frame, it cannot be decoded on its own as an IVOP can, but it achieves a higher compression ratio than an IVOP.
[0010]
In bidirectional predictive coding, the motion compensation predictor 305 applies motion compensation to encoded frames in both the past and the future to predict the current frame, and the resulting difference frame is transformed by the orthogonal transformer 303. A frame encoded in this mode is called a B picture or BVOP. Because a BVOP depends on other frames in the past and the future, it cannot be decoded on its own, but it achieves a still higher compression ratio than an IVOP or PVOP.
[0011]
Each VOP is predicted in units of MBs (macroblocks) of 16 × 16 pixels. The available prediction directions differ among IVOP, PVOP, and BVOP. An IVOP encodes every MB independently. A PVOP has two modes: one in which an MB is encoded by prediction from a past frame, and one in which the MB is encoded independently without prediction. A BVOP has four modes: prediction from the future, prediction from the past, prediction from both, and independent encoding of the MB without prediction.
[0012]
The motion estimator 304 performs pattern matching on the motion region of the input code for each MB, and detects a motion vector with 0.5 pixel accuracy. The motion compensation predictor 305 uses the motion vector detected by the motion estimator 304 to make a shift by the amount of motion and performs prediction. The motion vector has a horizontal direction and a vertical direction, and is transmitted as additional information of the MB together with an MC (Motion Compensation) mode indicating where the prediction is from.
[0013]
The subtractor 314 subtracts the prediction signal supplied by the motion compensation predictor 305 from one frame of the input code 301 read from the frame memory 302, and the resulting difference image signal is supplied to the orthogonal transformer 303, where it is orthogonally transformed. MPEG uses the DCT for this transform. The DCT is an orthogonal transform obtained by discretizing, over a finite interval, an integral transform whose kernel is the cosine function. In MPEG, a two-dimensional DCT is applied to each 8 × 8 DCT block obtained by dividing an MB into four. Because a video signal generally has many low-frequency components and few high-frequency components, the DCT concentrates the coefficients in the low-frequency components, and the subsequent quantizer 306 can then reduce the amount of information efficiently.
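The energy-compaction property mentioned here can be checked with a direct (unoptimized) 8 × 8 two-dimensional DCT. The sketch below uses the orthonormal DCT-II definition; the function name `dct_2d` is illustrative, and real encoders use fast factorizations rather than this quadruple loop.

```python
import math

N = 8  # DCT block size used by MPEG


def dct_2d(block):
    """Orthonormal 2-D DCT-II of an N x N block given as a list of rows."""
    def c(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for i in range(N):
                for j in range(N):
                    s += (block[i][j]
                          * math.cos((2 * i + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * j + 1) * v * math.pi / (2 * N)))
            out[u][v] = (2 / N) * c(u) * c(v) * s
    return out


# A perfectly flat block has only a DC (lowest-frequency) coefficient.
flat = [[10] * N for _ in range(N)]
coeffs = dct_2d(flat)
```

For the flat block all of the signal ends up in `coeffs[0][0]`, which is the extreme case of the concentration in low frequencies that the paragraph describes.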
[0014]
The DCT coefficients obtained by the orthogonal transformer 303 are quantized by the quantizer 306. In this quantization, each DCT coefficient is divided by a quantization value obtained by multiplying the corresponding entry of the quantization matrix, an 8 × 8 table of two-dimensional frequency weights based on visual characteristics, by the quantization scale, a scalar applied to the matrix as a whole. When the decoder performs inverse quantization, it multiplies by the same quantization value to obtain a value close to the original DCT coefficient.
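A minimal sketch of this divide-then-multiply scheme follows. It is a simplification: the actual MPEG standards add normalization constants and specific rounding rules that are omitted here, and the 2 × 2 matrices are used only to keep the example short.

```python
def quantize(coeffs, matrix, scale):
    """Divide each coefficient by (matrix entry * scale) and round to an integer level."""
    return [[round(c / (w * scale)) for c, w in zip(row_c, row_w)]
            for row_c, row_w in zip(coeffs, matrix)]


def dequantize(levels, matrix, scale):
    """Multiply each level by the same quantization value used by the encoder."""
    return [[lv * w * scale for lv, w in zip(row_l, row_w)]
            for row_l, row_w in zip(levels, matrix)]


# Illustrative values only: larger weights for higher frequencies.
coeffs = [[80.0, 13.0], [-7.0, 1.0]]
matrix = [[8, 16], [16, 32]]
levels = quantize(coeffs, matrix, scale=2)
restored = dequantize(levels, matrix, scale=2)
```

The small high-frequency coefficients quantize to zero and cannot be recovered, which is where the information reduction (and the loss) comes from.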
[0015]
The quantized data is encoded by the entropy encoder 310. The entropy encoder 310 is generally a variable length coder (VLC) and applies variable length coding to the quantized data. Among the quantized values, the direct current (DC) component is coded with DPCM (Differential Pulse Code Modulation), a form of predictive coding. The alternating current (AC) components are zigzag scanned from low to high frequencies, each run of zeros together with the nonzero coefficient that follows it is treated as one event, and Huffman coding is applied so that events with a higher probability of occurrence are assigned shorter codes.
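The zigzag scan and the grouping into (zero run, coefficient) events can be sketched as follows. The helper names are illustrative, the Huffman table lookup that would follow is omitted, and trailing zeros are simply dropped where a real coder would emit an end-of-block code.

```python
def zigzag_order(n):
    """Return (row, col) positions of an n x n block in zigzag scan order."""
    def key(pos):
        i, j = pos
        s = i + j
        # Odd anti-diagonals run top-right to bottom-left, even ones the reverse.
        return (s, i if s % 2 else -i)
    return sorted(((i, j) for i in range(n) for j in range(n)), key=key)


def run_length_events(block):
    """Turn the AC coefficients of a quantized block into (zero_run, level) events."""
    order = zigzag_order(len(block))
    ac = [block[i][j] for i, j in order][1:]   # skip the DC coefficient
    events, run = [], 0
    for value in ac:
        if value == 0:
            run += 1
        else:
            events.append((run, value))
            run = 0
    return events                               # trailing zeros are not emitted


block = [[9, 3, 0, 0],
         [0, 0, 0, 0],
         [2, 0, 0, 0],
         [0, 0, 0, 0]]
events = run_length_events(block)
```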
[0016]
The entropy encoder 310 entropy-encodes not only the quantized data but also, according to predetermined conditions, the auxiliary information related to the motion vectors output from the motion estimator 304 and the motion compensation predictor 305. The entropy-encoded data is stored in the temporary buffer memory 312 through the multiplexer 311 and output as encoded data (output code) 317 at a predetermined transfer rate. The code amount of each macroblock of the output code 317 is reported to the code amount controller 313, which notifies the quantizer 306 of the error between the reported code amount and the target code amount. The quantizer 306 controls the code amount by adjusting the quantization scale based on this error information.
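The feedback loop between the code amount controller 313 and the quantizer 306 could look like the sketch below. The proportional update rule, the `gain` parameter, and the function name are assumptions made for illustration; the text only says that the quantization scale is adjusted from the error code amount. The 1 to 31 range reflects the 5-bit quantizer scale commonly used in MPEG video and is likewise an assumption here.

```python
def update_quant_scale(scale: int, produced_bits: int, target_bits: int,
                       gain: float = 0.5, lo: int = 1, hi: int = 31) -> int:
    """Raise the scale when too many bits were produced, lower it when too few.

    A coarser (larger) quantization scale yields fewer bits, so the scale
    moves in the same direction as the error (produced - target).
    """
    error = produced_bits - target_bits
    adjusted = scale * (1 + gain * error / target_bits)
    return max(lo, min(hi, round(adjusted)))
```

For example, producing 20 % more bits than targeted at scale 10 moves the scale to 11 with the default gain.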
[0017]
The quantized data output from the quantizer 306 is inversely quantized by the inverse quantizer 307 and further inversely orthogonally transformed by the inverse orthogonal transformer 308 using a transform base corresponding to the orthogonal transformer 303. The inverse quantizer 307 and the inverse orthogonal transformer 308 constitute a local decoding unit 315. The signal output from the inverse orthogonal transformer 308 is added to the frame from the motion compensated predictor 305 in the adder 316 to form a decoded image, and then temporarily stored in the reference prediction memory 309. The frame temporarily stored in the reference prediction memory 309 is used as a reference frame (reference decoded image) for calculating a difference image in the motion compensation predictor 305.
[0018]
Incidentally, the group of pictures from one I picture up to the picture just before the next I picture is called a GOP (Group Of Pictures), and the range from one IVOP up to the VOP just before the next IVOP is called a GOV (Group Of VOP). For storage media and similar uses, one GOP or GOV generally consists of about 15 pictures. FIG. 9 shows the structure of the header used to form a GOV.
[0019]
As shown in the figure, the header consists of a 32-bit GOV start code, an 18-bit time code indicating the time from the beginning of the sequence, a 1-bit closed gov flag indicating whether the images in the GOV can be reproduced independently of other GOVs, and a 1-bit broken link flag indicating whether the preceding GOV data is unavailable, for example because of editing.
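The 52-bit layout just described (32 + 18 + 1 + 1 bits) can be packed and unpacked with plain integer shifts, as sketched below. The start code value 0x000001B3 is given as our understanding of the MPEG-4 Visual group-of-VOP start code and should be treated as an assumption to check against the standard; the time code is handled as an opaque 18-bit field rather than split into hours, minutes, and seconds.

```python
GOV_START_CODE = 0x000001B3  # assumed MPEG-4 Visual value; verify against the standard


def pack_gov_header(time_code: int, closed_gov: bool, broken_link: bool) -> int:
    """Pack the header fields into a 52-bit integer, most significant field first."""
    if not 0 <= time_code < (1 << 18):
        raise ValueError("time_code must fit in 18 bits")
    bits = GOV_START_CODE                  # 32 bits
    bits = (bits << 18) | time_code        # 18 bits
    bits = (bits << 1) | int(closed_gov)   # 1 bit
    bits = (bits << 1) | int(broken_link)  # 1 bit
    return bits


def unpack_gov_header(bits: int):
    """Inverse of pack_gov_header; returns (time_code, closed_gov, broken_link)."""
    broken_link = bool(bits & 1)
    closed_gov = bool((bits >> 1) & 1)
    time_code = (bits >> 2) & ((1 << 18) - 1)
    return time_code, closed_gov, broken_link


header = pack_gov_header(5, closed_gov=True, broken_link=False)
```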
[0020]
The encoded bit stream output from the image encoding / transmission apparatus having the configuration shown in FIG. 8 is distributed by transmission through a storage medium or a network and reproduced by the terminal apparatus. In the case of reproduction by the terminal device, the distributed coded bit stream is reproduced after being decoded by the MPEG decoder of the terminal device.
[0021]
FIG. 10 shows a block diagram of an example of a conventional MPEG decoder. In the figure, when an encoded bit stream distributed to a terminal device is input as an input code 401 to an MPEG decoder, it is supplied to a demultiplexer 402. The demultiplexer 402 separates the input code 401 into information such as entropy-encoded motion vector information and texture information, and supplies each information to the entropy decoder 403.
[0022]
The entropy decoder 403 decodes the entropy-encoded texture information and supplies it to the inverse quantizer 404, and decodes the entropy-encoded motion vector information and supplies it to the motion compensation predictor 407. The entropy decoder 403 is generally an inverse variable length coder (IVLC) and applies variable length decoding to the demultiplexed data. The inverse quantizer 404 multiplies the entropy-decoded texture information by the quantization value to obtain a value close to the original orthogonal transform coefficient, and supplies the result to the inverse orthogonal transformer 405.
[0023]
The motion compensation predictor 407 works macroblock by macroblock. Taking the position of the current macroblock as the base, it uses the motion vector information to locate the corresponding region in the reference frame stored in the reference prediction memory 406. It then shifts that region by the motion vector obtained from the entropy decoder 403 and places it at the position of the current macroblock, thereby creating a prediction frame. The inverse orthogonal transformer 405 applies the inverse orthogonal transform to the orthogonal transform coefficients input from the inverse quantizer 404 to obtain either the original frame information or the inter-frame difference information. MPEG uses the IDCT for this inverse transform.
[0024]
The frame obtained by the IDCT is added to the prediction frame from the motion compensation predictor 407 in the adder 408 and then output as an output code 409 and stored in the reference prediction memory 406. In this way, the input code 401 inputted is decoded by the MPEG decoder of the terminal device and reproduced as video information.
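The final reconstruction step, in which the motion-compensated prediction is added to the IDCT output, can be sketched for a single block as follows. The function name, the integer-pixel motion vector, and the clipping to the 8-bit range are illustrative assumptions; half-pixel interpolation and boundary handling are left out.

```python
def reconstruct_block(reference, top, left, mv, residual):
    """Rebuild one block: fetch the shifted reference region and add the residual.

    reference : decoded reference frame as a list of rows
    top, left : position of the current block in the frame
    mv        : (dy, dx) integer motion vector pointing into the reference
    residual  : inverse-transformed difference block (list of rows)
    """
    dy, dx = mv
    out = []
    for r, res_row in enumerate(residual):
        row = []
        for c, res in enumerate(res_row):
            predicted = reference[top + dy + r][left + dx + c]
            row.append(max(0, min(255, predicted + res)))  # clip to 8 bits
        out.append(row)
    return out


ref = [[r * 4 + c for c in range(4)] for r in range(4)]
block = reconstruct_block(ref, top=0, left=0, mv=(1, 1),
                          residual=[[1, -1], [0, 300]])
```

The reconstructed block is both output for display and written back as reference data, which is why the decoder must hold reference frames in memory.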
[0025]
[Problems to be solved by the invention]
Conventionally, when video information is transmitted over a transmission path, an image encoding and transmission apparatus built on the general MPEG system configuration for network transmission, so as to avoid problems such as synchronization on the image receiving and decoding apparatus side, encodes the video information in units of GOPs or GOVs. The encoded bit stream obtained by encoding the video information in units of GOVs is generally transmitted over the transmission path, in order from the head of each GOV, to an image receiving and decoding apparatus that is likewise built on the general MPEG system configuration for network transmission.
[0026]
FIG. 11 is a block diagram of an example of a conventional image encoding and transmission apparatus, and FIG. 12 is a block diagram of an example of a conventional image receiving and decoding apparatus. As shown in FIG. 11, the conventional image encoding and transmission apparatus comprises an encoding unit 501 and an encoding transmission unit 502. To use the transmission path efficiently, the encoding transmission unit 502 generally divides each GOV into smaller packets for transmission.
[0027]
The encoding unit 501 comprises a buffer 504 that temporarily stores the image signal from the input unit 503, a multi-frame storage memory 506 that stores at least one frame of the input image signal, a GOV management unit 505, an encoding unit 510 that encodes the image signal output from the multi-frame storage memory 506 by the MPEG method, a buffer 509 that stores the encoded frames, a decoding unit 511 that decodes the encoded frames, and a frame reference prediction memory 512 that generates the reference frame and supplies it to the encoding unit 510.
[0028]
In order to transmit the GOV as the unit of encoding, the GOV management unit 505 controls the switches SW21 and SW22 as necessary while monitoring the storage state of the multi-frame storage memory 506 and the GOV storage buffer memory 513, so that the encoded output is delivered to the encoding transmission unit 502 in units of GOVs.
[0029]
The GOV storage buffer memory 513 of the encoding transmission unit 502 temporarily stores the encoded frames input from the encoding unit 510 in the encoding unit 501 via the buffer 509 and the switch SW22, and then outputs them to the transmission packet sending means 514. The transmission packet sending means 514 divides them into at least one packet, the unit of transmission on the network, and sends the packets to the network or storage means 515 while monitoring the state of the network's transmission band.
[0030]
As shown in FIG. 12, the conventional image receiving and decoding apparatus comprises a decoding receiving unit 601 that receives at least one packet sent to the network or storage means 515 (603) by the encoding transmission unit 502 of FIG. 11, a decoding unit 602 that decodes the received data, and a display frame memory 614 that stores one frame of the image signal from the decoding unit 602 before supplying it to the display device 615 for display.
[0031]
The decoding receiving unit 601 receives the input packets with the transmission packet receiving unit 604 and analyzes the bit stream with the encoded bit stream analyzing unit 605. The resulting encoded frame information is input to the GOV storage buffer memory 608 via the switch SW31, which is controlled by the frame storage order control unit 606, and the GOV, which is the unit of decoding, is reconstructed there.
[0032]
The GOV management means 609 in the decoding unit 602 controls the switch SW32 as necessary while monitoring the storage state of the GOV storage buffer memory 608, thereby transmitting the encoded data in units of GOV via the buffer buffer 611. The data is sent to the decoding unit 612. The decoding unit 612 performs decoding in units of GOV using the reference frame from the frame reference prediction memory 613, obtains a reproduced video signal, and supplies it to the display frame memory 614. The reproduced video signal for one frame stored in the display frame memory 614 is displayed as an image by the display device 615.
[0033]
Here, when the encoded bit stream is transmitted over the transmission path from the image encoding and transmission apparatus of FIG. 11 to the image receiving and decoding apparatus of FIG. 12, fluctuations in the transmission rate can cause packets to be lost, so that the information in the GOV cannot be completely reconstructed and decoding of the GOV may be affected.
[0034]
In addition, when a sharp drop in the transmission rate means that the bit rate of the GOV currently being transmitted can no longer be sustained, packet delays or retransmissions lengthen the time needed to reconstruct the GOV, and transmission may be delayed as a result.
[0035]
As described above, when the GOV transmission is not sufficient, the image receiving / decoding apparatus discards an incomplete GOV that is not in time for the reproduction time, thereby causing a problem that reproduction for one GOV is stopped.
[0036]
Alternatively, when the reproduction time of an incomplete GOV arrives, the image receiving and decoding apparatus can, instead of discarding the incomplete GOV, decode it as far as decoding is possible and continue reproduction and display for as long as it can. Even in this case, however, the VOPs in the latter part of the GOV that were not transmitted by the reproduction time cannot be decoded, so reproduction stops until the next GOV.
[0037]
Conventionally, to avoid such loss or delay of a GOV, the transmission rate of the transmission path between the image encoding and transmission apparatus and the image receiving and decoding apparatus is monitored continuously, and when a sudden drop in the transmission rate is detected, the image encoding and transmission apparatus is requested to lower the encoding bit rate. One conceivable approach is for the image encoding and transmission apparatus to switch the encoding bit rate by lowering the spatial resolution, the frame rate, or the image quality according to the requested encoding bit rate, generate an encoded bit stream with a lower bit rate, and have the encoding transmission unit 502 transmit it.
[0038]
However, in order to instruct the change of the encoding bit rate while monitoring the transmission bit rate of such a transmission line, a large GOV reception buffer is prepared in the decoding receiving unit of the image receiving decoding apparatus, and transmission is performed. It is necessary to detect the transmission bit rate of the path. Since the transmission bit rate of the transmission line is detected by monitoring the accumulation state of the GOV reception buffer, a certain time difference is generated until the fluctuation of the transmission bit rate is detected. Due to this time difference, an attempt is made to continue to transmit the encoded bit stream with the high encoding bit rate to the transmission line with the low transmission bit rate, so that the transmission delay increases.
[0039]
Further, consider the case in which, in the encoding transmission unit 502 of the image encoding and transmission apparatus shown in FIG. 11, the encoded bit stream output from the encoding unit 510 in the encoding unit 501 via the buffer 509 and the switch SW22 is stored in the GOV storage buffer memory 513, after which the GOV is packetized by the transmission packet sending means 514 and sent to the transmission path, that is, the network or storage means 515.
[0040]
After the encoded bit stream output from the encoding unit 501 has been stored in the GOV storage buffer memory 513, the transmission packet sending means 514 packetizes the GOV and sends it to the transmission path. Suppose that the image encoding and transmission apparatus then detects that the transmission bit rate has dropped and tries to change the encoding bit rate of the GOV currently being sent partway through.
[0041]
In an ordinary image encoding and transmission apparatus, once a GOV has been stored in the GOV storage buffer memory 513 of the encoding transmission unit 502, its encoding bit rate can no longer be changed, so a time lag arises in the control of the encoding bit rate.
[0042]
In a more advanced image encoding and transmission apparatus, the GOV that has not yet been completely sent to the transmission path is analyzed to identify the VOPs that have not been sent. The GOV currently being transmitted is then deleted from the GOV storage buffer memory 513, and the encoding unit 501 is requested to re-encode from the identified VOP onward. If VOPs for the remaining frames can be obtained starting from the identified VOP, the transmission delay of the GOV is small and the drop in reproduction quality on the image receiving and decoding apparatus side can be limited.
[0043]
However, with an image encoding and transmission apparatus and image receiving and decoding apparatus configured in this way, distributing live video or the like in real time requires a large frame storage buffer memory so that re-encoding can be performed on the image encoding and transmission apparatus side. There is also the problem that complicated control of the encoding unit 501 is required.
[0044]
The present invention has been made in view of the above points. Its object is to provide an image encoding and transmission apparatus that, when an encoded bit stream is transmitted over a transmission path to an image receiving and decoding apparatus, takes into account the encoding method and transmission order of the encoded frames in each GOV. Even on a path whose transmission rate fluctuates so severely that, with the conventional method, packet loss makes it difficult to reconstruct the information in the GOV completely, the image receiving and decoding apparatus can then use the GOV information delivered by the decoding time: the frame rate on the decoding side changes automatically according to the number of frames of the GOV accumulated by the decoding time, and frames are reproduced and displayed at intervals as nearly equal as possible. The effect on GOV decoding is thereby reduced, and image transmission of higher quality than before can be realized.
[0045]
[Means for Solving the Problems]
In order to achieve the above object, the present invention provides an image encoding and transmission apparatus that encodes an input image signal using reference frames and transmits the resulting encoded frames according to a predetermined frame sending order, comprising: a first frame storage memory that stores at least one frame of the input image signal; a second frame storage memory that stores a predetermined number of reference frames used for encoding; first switch means provided on the output side of the first frame storage memory; second switch means provided on the input side and the output side of the second frame storage memory; reference frame control means that controls the first and second switch means according to the predetermined frame sending order, thereby controlling the transfer of the input image signals and reference frames stored in the first and second frame storage memories; an encoding unit that obtains an encoded frame by encoding the input image signal of the frame number selected by the first switch means from among the input image signals stored in the first frame storage memory, using one reference frame selected by the second switch means from among the reference frames stored in the second frame storage memory; a local decoding unit that decodes the encoded frame output from the encoding unit, generates a new reference frame using the one reference frame that the encoding unit used for encoding, and stores the new reference frame in the second frame storage memory via the second switch means; input/output frame management means that monitors the accumulation state of the first frame storage memory to manage its storage state, monitors the encoded frames output from the encoding unit to manage the encoded output, and notifies the reference frame control means of these states; a third frame storage memory that stores a group of encoded frames generated by the encoding unit; and output means that packetizes each frame of the group of encoded frames stored in the third frame storage memory and transmits it according to the predetermined frame sending order.
[0046]
In the present invention, the encoding unit obtains an encoded frame by encoding the input image signal of the frame number selected by the first switch means from among the input image signals stored in the first frame storage memory, using one reference frame selected by the second switch means from among the reference frames stored in the second frame storage memory. The local decoding unit decodes the encoded frame output from the encoding unit, generates a new reference frame using the one reference frame that the encoding unit used for encoding, and stores the new reference frame in the second frame storage memory via the second switch means. The encoding unit can therefore use any of the frames stored in the second frame storage memory as the reference frame when encoding.
[0047]
Here, the output means includes encoded bitstream analyzing means for identifying each frame included in the group of encoded frames stored in the third frame storage memory, and frame transmission control means for controlling the order in which the frames included in the group of encoded frames stored in the third frame storage memory are sent to the transmission path in accordance with the predetermined frame transmission order.
[0048]
The encoded bitstream analyzing means identifies each encoded frame included in the GOV stored in the third frame storage memory. Using the position information of each encoded frame identified by the encoded bitstream analyzing means, the frame transmission control means can rearrange the encoded frames of the GOV stored in the third frame storage memory into the predetermined frame transmission order when the GOV is sent to the transmission path.
[0049]
In order to achieve the above object, the present invention is further characterized in that the reference frame control means controls the first and second switch means using, as the predetermined frame transmission order, the order in which transmission frame numbers are determined by calculation means. Let M be the number of encoded frames included in the group of encoded frames, and let the frame numbers of the input image signal stored in the first frame storage memory be the integers from 0 to M−1. The calculation means comprises initialization means that takes frame number 0 as the first transmission frame number and initializes a variable A to M and a count value C to 1, and first calculation means that, after this initialization, calculates B by
B = [(A + 1) / 2] ([ ] is the Gauss symbol, i.e., the integer part)
and sets a variable n to 0. It further comprises second calculation means that calculates D from the value of B obtained by the first calculation means and the value of the variable n by
D = B + 2 × B × n (n = 0, 1, 2, ...)
together with: first determination means that determines whether the value of D calculated by the second calculation means is less than M; transmission frame number determination means that, when the first determination means determines that D < M, adopts D as the next transmission frame number, increments the variable n and the count value C by 1 each, and causes the second calculation means to calculate D again; second determination means that, when the first determination means determines that D is M or more, determines whether the count value C is greater than M; and means that, when the second determination means determines that the count value C is M or less, sets A to the current value of B and causes the first calculation means to calculate B again. These operations are repeated until the second determination means determines that the count value C is greater than M.
[0050]
In the present invention, the predetermined frame transmission order is the order in which the transmission frame numbers are determined by the calculation means, and this order is used by both the reference frame control means and the output means. Consequently, even when the image receiving and decoding apparatus decodes the encoded frames simply in the order received, encoded transmission can be performed such that the frame rate improves each time an encoded frame within the GOV is decoded.
[0051]
In the present invention, the reference frame control means manages the input and output of frames from the first frame storage memory and the second frame storage memory to the encoding unit and the local decoding unit so that encoding is performed using at least one reference frame at a frame position that, according to the predetermined transmission frame order, has already been transmitted before the frame to be encoded. With this management, even when the image receiving and decoding apparatus decodes the encoded frames in the order received, the encoded output can be controlled so that the frame rate improves each time an encoded frame within the GOV is decoded.
[0052]
Further, according to the present invention, in response to a reference frame switching request from the encoding unit, the reference frame control means manages the input and output of frames from the first frame storage memory and the second frame storage memory to the encoding unit and the local decoding unit so that the frame currently being encoded is encoded again after switching to another reference frame, or another combination of reference frames, that has not yet been used for that frame. The candidates are the reference frames at frame positions that, according to the transmission frame order obtained by the calculation means, are transmitted before the frame to be encoded.
[0053]
The present invention further includes means for monitoring the encoding bit rate during encoding in the encoding unit and for issuing, in accordance with a predetermined encoding output condition, a reference frame switching request to the reference frame control means so that re-encoding is performed. The predetermined encoding output condition is thereby satisfied.
[0054]
Further, according to the present invention, the predetermined encoding output condition is that the encoded frame to be output is the one with the minimum encoding bit rate among the results obtained with the reference frames at frame positions that, according to the transmission frame order obtained by the calculation means, are transmitted before the frame to be encoded. According to the present invention, an optimum encoded output can be obtained when each frame in the GOV is encoded based on the transmission frame order.
[0055]
Further, according to the present invention, the content of the third frame storage memory, which can store a group of at least one encoded frame generated by the encoding unit, is discarded when the time comes to transmit the next group of at least one encoded frame generated by the encoding unit, and that next group is promptly stored. In the present invention, the transmission bit rate of the transmission path therefore does not have to be monitored constantly in order to control the encoding bit rate, and the increase in transmission delay caused by the time lag needed to detect fluctuations in the transmission bit rate, as well as its effect on GOV decoding, can be reduced.
[0056]
DETAILED DESCRIPTION OF THE INVENTION
Next, embodiments of the present invention will be described with reference to the drawings. FIG. 1 shows a block diagram of an embodiment of an image encoding and transmission apparatus using arbitrary reference frames according to the present invention. The image encoding and transmission apparatus of this embodiment broadly comprises an encoding unit 101 that can encode an input image using at least one reference frame, an encoded transmission unit 102 that can efficiently send a GOV, which is a group of at least one encoded frame, to a transmission path, input means 103 that allows the encoding unit 101 to acquire the images to be encoded and transmitted, and a network or storage means 122 that can convey the encoded frames to an image receiving and decoding apparatus.
[0057]
In the present embodiment, the encoding unit 101 includes a buffer 104, switches SW1 to SW5, a first multi-frame storage memory 106, input/output frame management means 107, a buffer 109, reference frame control means 110, an encoding unit 111, a second multi-frame storage memory 113, and a local decoding unit 116. Here, the encoding unit 101 has a configuration in which the frame reference prediction memory 512 of the encoding unit 501 shown in FIG. 11 is expanded into the second multi-frame storage memory 113. Accordingly, the GOV management unit 505 of FIG. 11 is extended in FIG. 1 into the reference frame control means 110 and the input/output frame management means 107.
[0058]
The local decoding unit 116 in FIG. 1 corresponds to the local decoding unit 315 in FIG. 8, the second multi-frame storage memory 113 in FIG. 1 corresponds to the reference prediction memory 309 in FIG. 8, and the encoding unit 111 corresponds to the remaining part of FIG. 8.
[0059]
Each component of the encoding unit 101 is described below. The buffer 104 has an area sufficient to temporarily hold a plurality of image frames input from the input means 103, and serves to adjust the storing of one GOV's worth of frames into the first multi-frame storage memory 106.
[0060]
The switch SW1 manages the connection between the buffer 104 and the first multi-frame storage memory 106. The switch SW1 is connected, disconnected, and switched by the input/output frame management means 107. The first multi-frame storage memory 106 must be able to store at least the number of input frames needed to make up a GOV, so that the encoding unit 111 can perform encoding in units of GOV.
[0061]
For example, when the frame rate of the input image is 30 [fps], one GOV is 0.5 seconds long, and one GOV contains 15 encoded frames, the first multi-frame storage memory 106 is provided with a memory area capable of storing input images for at least 15 frames.
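The sizing rule in this example is simple arithmetic. The following Python sketch reproduces it under the assumption that the number of frames in one GOV is the input frame rate multiplied by the GOV duration; the function names and the per-frame byte count are hypothetical and do not appear in the specification.

```python
import math


def frames_per_gov(frame_rate_fps: float, gov_seconds: float) -> int:
    """Number of input frames that make up one GOV (hypothetical helper)."""
    return math.ceil(frame_rate_fps * gov_seconds)


def min_frame_memory_bytes(frame_rate_fps: float, gov_seconds: float,
                           bytes_per_frame: int) -> int:
    """Smallest memory area that holds one GOV's worth of frames.

    bytes_per_frame is an assumed parameter; the text does not give a frame size.
    """
    return frames_per_gov(frame_rate_fps, gov_seconds) * bytes_per_frame


# Values from the example in the text: 30 fps and a 0.5 second GOV.
gov_frames = frames_per_gov(30, 0.5)
```

The same calculation applies to any memory that must hold one frame per frame position in the GOV.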
[0062]
To control encoding in units of GOV, the input/output frame management means 107 manages the storing of input frames from the buffer 104 into the first multi-frame storage memory 106 by connecting, disconnecting, and switching the switch SW1. The input/output frame management means 107 also manages the output signal from the buffer 109 by connecting, disconnecting, and switching the switch SW5. Further, the input/output frame management means 107 notifies the reference frame control means 110 of a request to start encoding in units of GOV, thereby managing the timing at which GOV-unit encoding starts.
[0063]
That is, the input/output frame management means 107 monitors whether the input frames needed to encode a GOV are stored in the first multi-frame storage memory 106. Once it has confirmed that sufficient input frames for one GOV are stored in the first multi-frame storage memory 106, the frames included in that GOV can be encoded in any order, GOV by GOV.
[0064]
Further, by monitoring the encoded frames output from the encoding unit 111, the input/output frame management means 107 can control the storage state of the third multi-frame storage memory 117. By notifying the reference frame control means 110 of the storage status of the first multi-frame storage memory 106 and the third multi-frame storage memory 117, it also enables the reference frame control means 110 to control at least one reference frame.
[0065]
The switch SW5 manages the output timing of the GOV temporarily stored in the buffer 109. The timing at which the switch SW5 is connected, disconnected, and switched is managed by the input/output frame management means 107. The buffer 109 temporarily stores the encoded frames of a GOV output from the encoding unit 111 so that they can be output to the encoded transmission unit 102 in units of GOV.
[0066]
The switch SW2 manages the connection between the first multi-frame storage memory 106 and the encoding unit 111, and is connected, disconnected, and switched under the control of the reference frame control means 110. The second multi-frame storage memory 113 must have frame memory capable of storing at least the number of frames needed to make up a GOV, so that each reference frame generated by the local decoding unit 116 can be stored at the corresponding frame position in the GOV. For example, when the frame rate of the input image is 30 [fps], one GOV is 0.5 seconds long, and one GOV contains 15 encoded frames, the second multi-frame storage memory 113 is provided with a memory capacity capable of storing reference frames for at least 15 frames.
[0067]
The switch SW3 manages the connection between the local decoding unit 116 and the second multi-frame storage memory 113, and is connected, disconnected, and switched under the control of the reference frame control means 110. The switch SW4 manages the connections between the second multi-frame storage memory 113 and the encoding unit 111, and between the second multi-frame storage memory 113 and the local decoding unit 116, and is likewise connected, disconnected, and switched under the control of the reference frame control means 110.
[0068]
By controlling the connection, disconnection, and switching of the switch SW2, the reference frame control means 110 identifies the input frame to be fed to the encoding unit 111 based on the predetermined frame transmission order and feeds the correct input frame to the encoding unit 111. By managing the connection, disconnection, and switching of the switch SW3, the reference frame control means 110 also ensures that each reference frame generated by the local decoding unit 116 is stored correctly at the corresponding reference frame position in the second multi-frame storage memory 113.
[0069]
Further, by managing the connection, disconnection, and switching of the switch SW4, the reference frame control means 110 ensures that, for the input frame currently being encoded, reference frames that precede it in the predetermined frame transmission order, that is, frames that have already been sent, can be used for predictive encoding.
[0070]
The local decoding unit 116 has a configuration like that of the local decoding unit 315 included in an MPEG encoding apparatus such as the one shown in FIG. 8. That is, the encoded frame output from the encoding unit 111 to the local decoding unit 116 corresponds to the encoded frame after quantization by the quantizer on the input side of the entropy encoder in FIG. 8, and the local decoding unit 116 includes at least an inverse quantizer and an inverse orthogonal transformer.
[0071]
When the local decoding unit 116 receives the encoded frame output from the encoding unit 111, it performs inverse quantization with the inverse quantizer and then performs the inverse orthogonal transform, thereby obtaining a decoded frame of the frame currently being encoded. The local decoding unit 116 outputs this decoded frame as a reference frame to the second multi-frame storage memory 113 via the switch SW3, in preparation for the encoding of the next input frame.
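The two steps of local decoding, inverse quantization followed by an inverse orthogonal transform, can be sketched in Python as follows. This is only an illustration under stated assumptions: it uses a one-dimensional 4-point Hadamard transform and a single uniform quantization step instead of the block DCT and quantizer of an actual MPEG encoder, and all function names are hypothetical.

```python
# 4-point Hadamard matrix. It is symmetric and H4 times H4 equals 4 times the
# identity matrix, so the inverse transform is H4 divided by 4.
H4 = [
    [1, 1, 1, 1],
    [1, -1, 1, -1],
    [1, 1, -1, -1],
    [1, -1, -1, 1],
]


def transform(block):
    """Forward orthogonal transform (encoder side)."""
    return [sum(h * x for h, x in zip(row, block)) for row in H4]


def quantize(coeffs, step):
    """Uniform quantization (encoder side); step is an assumed parameter."""
    return [round(c / step) for c in coeffs]


def dequantize(levels, step):
    """Inverse quantization (first stage of local decoding)."""
    return [q * step for q in levels]


def inverse_transform(coeffs):
    """Inverse orthogonal transform (second stage of local decoding)."""
    return [sum(h * c for h, c in zip(row, coeffs)) / 4 for row in H4]


def local_decode(levels, step):
    """Rebuild the samples that will serve as reference data."""
    return inverse_transform(dequantize(levels, step))


samples = [10, 12, 9, 11]
levels = quantize(transform(samples), step=1)
reference = local_decode(levels, step=1)
```

With a step of 1 the quantizer is lossless for integer input, so the reconstructed reference equals the input samples. With a larger step the reference differs from the input in the same way it would at the receiving decoder, which is the reason the encoder builds its reference frames from locally decoded data.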
[0072]
For example, when encoding an IVOP, the encoding unit 111 applies a predetermined orthogonal transform to the input frame acquired from the first multi-frame storage memory 106, as in an MPEG encoding apparatus such as the one shown in FIG. 8. In MPEG, this orthogonal transform is the DCT. However, as long as the encoding unit 111 and the local decoding unit 116 of the image encoding and transmission apparatus of this embodiment and the decoding unit 216 of the image receiving and decoding apparatus all use the same orthogonal transform, other orthogonal transforms such as the Hadamard transform or the wavelet transform may be used, and the transform is not particularly limited.
[0073]
Thereafter, the amount of information is reduced by the quantizer in the encoding unit 111. In the case of a PVOP or BVOP, the input frame acquired from the first multi-frame storage memory 106 and a reference frame acquired from the second multi-frame storage memory 113 are used: motion compensation is applied to the reference frame, and the predetermined orthogonal transform is applied to the difference frame between the motion-compensated frame and the input frame. Thereafter, the amount of information is reduced by the quantizer in the encoding unit 111.
[0074]
The encoding unit 111 outputs the encoded data at this point to the local decoding unit 116. It then applies entropy encoding to the quantized data. Thereafter, additional information such as entropy-encoded vector information is multiplexed by a multiplexer in the encoding unit 111. The encoding unit 111 outputs the encoded frame produced in this way to the buffer 109.
[0075]
Further, when the code amount must be minimized at encoding, the code amount controller, which monitors the buffer memory in the encoding unit 111, needs to monitor the code amount of the encoding result output each time encoding is performed with one of the reference frames, and to adopt the result with the smallest code amount as the encoded frame output. For this purpose, the code amount controller in the encoding unit 111 can notify the reference frame control means 110 of a reference frame change request.
[0076]
When the reference frame control means 110 receives this reference frame change request, it switches to another reference frame chosen from the reference frames that, according to the predetermined frame transmission order, were encoded and output before the input frame currently being encoded. The encoding unit 111 can thereby acquire a new reference frame based on the predetermined frame transmission order.
[0077]
By repeating this operation, the code amounts obtained with a plurality of reference frames can be compared, so that the encoding unit 111 can identify the reference frame that gives the minimum code amount and output the encoded frame with the minimum code amount to the buffer 109.
[0078]
For example, assume that 15 frames are encoded as one GOV, that encoding is performed based on the predetermined frame transmission order shown in FIG. 2, and that encoding has progressed up to transmission order 3. In terms of frame numbers, frames 0, 8, 4, and 12 have then already been encoded and are stored as reference frames in the second multi-frame storage memory 113 of FIG. 1.
[0079]
In conventional encoding, only the one frame encoded immediately before is kept as a reference frame, and reference frames from earlier encodings cannot be used. In the present embodiment, however, by controlling the switch SW4, the reference frames generated by the encoding performed so far, here the frames of frame numbers 0, 8, 4, and 12 stored in the second multi-frame storage memory 113, can all be used as reference frames.
[0080]
Therefore, the reference frame used to encode the next target, frame number 2 (transmission order 4), can be selected from the four frames of frame numbers 0, 8, 4, and 12. Encoding is performed with each of these four frames, the output code amount is obtained in each case, and the output code amounts are compared, so that the reference frame that yields the minimum code amount can be identified.
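The comparison described here can be sketched in Python as a loop over the candidate reference frames. The sketch makes two assumptions that are not in the specification: a frame is modeled as a short list of pixel values, and the output code amount is approximated by the sum of absolute differences between the input frame and the reference frame in place of a real encoding pass. All names and the toy pixel values are hypothetical.

```python
def estimated_code_amount(input_frame, reference_frame):
    """Stand-in for one encoding pass: sum of absolute pixel differences."""
    return sum(abs(a - b) for a, b in zip(input_frame, reference_frame))


def select_reference(input_frame, reference_memory, candidates):
    """Try every candidate and return (frame_number, code_amount) of the smallest."""
    best = None
    for number in candidates:
        amount = estimated_code_amount(input_frame, reference_memory[number])
        if best is None or amount < best[1]:
            best = (number, amount)
    return best


# Toy reference frames at the positions already encoded in the example.
reference_memory = {
    0: [10, 10, 10, 10],
    8: [50, 50, 50, 50],
    4: [30, 30, 30, 30],
    12: [70, 70, 70, 70],
}
frame_2 = [14, 16, 18, 16]  # toy input frame for frame number 2

best_number, best_amount = select_reference(frame_2, reference_memory, [0, 8, 4, 12])
```

In this toy data frame 0 gives the smallest estimate, so it would be kept as the reference for frame 2. In the apparatus each trial corresponds to a reference frame change request followed by a re-encoding.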
[0081]
Next, the encoded transmission unit 102 will be described. In the present embodiment, the encoded transmission unit 102 includes a third multi-frame storage memory 117, encoded bitstream analysis means 118, frame transmission order control means 119, a switch SW6, and transmission packet sending means 121. Each component of the encoded transmission unit 102 is described below.
[0082]
The third multi-frame storage memory 117 is a memory that can acquire and store a GOV, via the switch SW5, from the buffer 109, which temporarily stores one GOV's worth of encoded frames generated by the encoding unit 111. The capacity of the third multi-frame storage memory 117 is calculated in advance from the maximum bit rate of the encoding unit 101 and the maximum bit rate of the network or storage means 122 serving as the transmission path, and is secured accordingly.
[0083]
The encoded bitstream analysis means 118 analyzes the GOV header of the GOV stored in the third multi-frame storage memory 117 and the header information of each encoded frame, and collects the information for identifying the position of each encoded frame in the GOV that is needed to send the encoded frames over the transmission path in the predetermined frame transmission order.
[0084]
The encoded bitstream analysis means 118 also manages the timing at which the GOV stored in the third multi-frame storage memory 117 is sent over the transmission path, by issuing a request to start encoded transmission to the frame transmission order control means 119. It notifies the frame transmission order control means 119 of the collected information for identifying the positions of the encoded frames in the GOV. Furthermore, it has means for being informed by the frame transmission order control means 119 that transmission of the encoded frames in the GOV has been completed.
[0085]
In addition, the encoded bitstream analysis means 118 acquires each encoded frame from the third multi-frame storage memory 117 via the switch SW6 and outputs it to the transmission packet sending means 121. If, at that time, information such as the time code (time_code) in the GOV header information, or the modulo time base (modulo_time_base) and VOP time increment (vop_time_increment) in the header information of the encoded frame, needs to be corrected, the encoded bitstream analysis means 118 preferably includes means capable of correcting this header information.
[0086]
The frame transmission order control means 119 manages the transmission order of the encoded frames in the GOV stored in the third multi-frame storage memory 117 by connecting, disconnecting, and switching the switch SW6 based on the predetermined frame transmission order and on the information, notified by the encoded bitstream analysis means 118, for identifying the positions of the encoded frames in the GOV. Here, the encoded frames generated by the encoding unit 111 are input to the third multi-frame storage memory 117 in the order of generation, and the switch SW6 is controlled so that the encoded frames are output from the third multi-frame storage memory 117 in the order of input.
[0087]
In addition, when the frame transmission order control means 119 detects, on switching the switch SW6, that all the encoded frames in the GOV have been transmitted, it notifies the encoded bitstream analysis means 118 that transmission of the encoded frames in the GOV has been completed.
[0088]
The switch SW6 manages the connection between the third multi-frame storage memory 117 and the encoded bitstream analysis means 118, and is connected, disconnected, and switched under the control of the frame transmission order control means 119. The transmission packet sending means 121 acquires each encoded frame from the encoded bitstream analysis means 118, breaks the acquired encoded frame down into at least one packet, and sends the packets to the network or storage means 122 serving as the transmission path.
[0089]
By providing the transmission packet sending means 121, which packetizes each encoded frame and sends it to the transmission path, the encoded frame in the GOV that the frame transmission order control means 119 has designated for transmission can be sent in a packet size that is carried efficiently on the transmission path. With the configuration described above and the functions of each component, an embodiment of the image encoding and transmission apparatus of the present invention can be realized.
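The packetizing step can be illustrated with a short Python sketch. It assumes that an encoded frame is available as a byte string and that the transmission path has a known maximum payload size per packet; the function name and the payload size are hypothetical, and real packet headers are omitted.

```python
def packetize(encoded_frame: bytes, max_payload: int) -> list:
    """Split one encoded frame into at least one packet payload."""
    if max_payload <= 0:
        raise ValueError("max_payload must be positive")
    # max(..., 1) makes an empty frame still produce one (empty) packet.
    length = max(len(encoded_frame), 1)
    return [encoded_frame[i:i + max_payload] for i in range(0, length, max_payload)]


frame = bytes(range(10))      # toy encoded frame of 10 bytes
packets = packetize(frame, 4)  # assumed payload limit of 4 bytes
```

Here the 10-byte frame is carried in three packets of 4, 4, and 2 bytes, and concatenating the payloads in order restores the frame.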
[0090]
The predetermined frame transmission order is a key element in the operation of the image encoding and transmission apparatus and the image receiving and decoding apparatus according to the present invention. The predetermined frame transmission order and the method of selecting reference frames at encoding are therefore described below.
[0091]
First, a method for determining the predetermined frame transmission order will be described with reference to the drawings. FIG. 2 illustrates the order in which the encoded frames included in a GOV are transmitted and which reference frame is used to encode each of them. FIG. 3 is a flowchart showing the method of obtaining the predetermined frame transmission order in the present invention. Note that the processing of FIG. 3 is performed externally in advance, and the resulting information on the transmission frame order is stored in the reference frame control means 110 beforehand so that it can be used. This is done at the initialization stage performed when the entire image encoding and transmission apparatus is started.
[0092]
First, let M be the number of encoded frames included in a group of encoded frames, that is, in a GOV in this embodiment. The frame numbers stored can then be expressed as the integers from 0 to M−1. Here, frame number 0 is the first transmission frame number. In the following description, it is assumed that M = 15 and that the GOV consists of PVOPs except for the first frame, which is an IVOP.
[0093]
First, A = M and C = 1 are set as initial values (step S101). Here, A is a variable used during the calculation, and C is a counter that counts how many entries of the frame transmission order have been obtained. The initialization is chosen this way because the first frame to be transmitted is taken to be frame 0, so one entry of the frame transmission order has already been obtained. Since M = 15 here, the initial values are A = 15 and C = 1.
[0094]
Next, the variable B is calculated by the following formula (step S102):
B = [(A + 1) / 2]
where [ ] denotes the Gauss symbol (the integer part). Subsequently, after the variable n is initialized to 0 (step S103), the variable D is calculated by the following formula (step S104):
D = B + 2 × B × n
[0095]
The values of D obtained in step S104 form an arithmetic sequence with first term B and common difference 2 × B, where n = 0, 1, 2, .... The frame numbers for the next entries in the frame transmission order are therefore obtained starting at a distance B from the position of frame 0 and at intervals of 2 × B thereafter.
[0096]
Next, it is determined whether D < M (step S105). D < M means that the frame position obtained in step S104 lies within the frame positions that exist in the GOV. In that case, the process proceeds to step S106, where the D obtained in step S104 is adopted as the next entry in the frame transmission order and stored as the next transmission frame number.
[0097]
The process then proceeds to step S107, where C = C + 1 and n = n + 1 are calculated. The counter C is incremented by one because a new entry in the frame transmission order has been obtained, and n is incremented by one to obtain the next term of the sequence. When step S107 is completed, the process returns to step S104 and the next term of the sequence is calculated. On the other hand, if it is determined in step S105 that D < M does not hold, the frame position obtained in step S104 is not within the frame positions that exist in the GOV.
[0098]
Let us now trace the processing so far with actual values. The frame transmission order starts with frame 0. The next entry is the first term B, and since B = [(15 + 1) / 2] = 8, frame 8 is adopted as the next entry. The sequence then increases by the common difference 2 × B, so the next calculation gives D = 8 + 2 × 8 × 1 = 24. Since this value is not less than M = 15 and cannot exist in the GOV, it is determined whether C > M (step S108).
[0099]
If C > M, all the frames included in the GOV have been adopted into the frame transmission order, and the process of obtaining the frame transmission order is terminated. On the other hand, if it is determined in step S108 that C > M does not hold, the process proceeds to step S109, where A = B, that is, the value of A is set to the value of B. The process then returns to step S102 and continues.
[0100]
The frame transmission order of the present invention is obtained by the processing described above. For reference, the processing from step S102 onward is traced a little further. At step S108, the frame transmission order obtained so far is frame 0 followed by frame 8, and the process continues from the state A = 15, B = 8, D = 24, C = 2, and n = 1.
[0101]
Since C > M does not hold at this point, the process proceeds to step S109, where the value of A is updated to the value of B, giving A = 8. In step S102, B = [(A + 1) / 2] is calculated again; since A is now 8, B = [4.5] = 4. After n is reset to 0 in step S103, the frame transmission order is obtained in step S104 using the new value B = 4. The value of D then changes to 4 (n = 0), 12 (n = 1), and 20 (n = 2) through the repetition of steps S104 to S107. Since D < M no longer holds at n = 2, the value of A is again updated to the value of B (4 at this point) in step S109 and the above process is repeated. The frame transmission order obtained so far is thus frame 0 → frame 8 → frame 4 → frame 12. The subsequent frame transmission order follows in the same way, as shown under "transmission order" in the figure.
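Steps S101 to S109 can be sketched in Python as follows. The sketch follows the formulas given above, with one labeled deviation: the text states the termination test of step S108 as C > M, but with C starting at 1 the counter equals M once all frames have been adopted, and a literal C > M test would then run one more pass with B = 1 and repeat the odd-numbered frames. The sketch therefore stops as soon as C reaches M. This reading, the function name, and the variable names are assumptions of the sketch, and it has been checked only for M = 15, the value used in the text.

```python
def frame_transmission_order(m: int) -> list:
    """Frame numbers of one GOV of m frames in transmission order (sketch)."""
    order = [0]                # frame 0 is always transmitted first
    a, c = m, 1                # step S101: A = M, C = 1
    while c < m:               # step S108, read here as "stop once C reaches M"
        b = (a + 1) // 2       # step S102: B = [(A + 1) / 2]
        n = 0                  # step S103
        while True:
            d = b + 2 * b * n  # step S104
            if d >= m:         # step S105: D < M does not hold
                break
            order.append(d)    # step S106: adopt D as the next frame number
            c += 1             # step S107
            n += 1
        a = b                  # step S109: A = B
    return order


order_15 = frame_transmission_order(15)
# order_15 == [0, 8, 4, 12, 2, 6, 10, 14, 1, 3, 5, 7, 9, 11, 13]
```

The first four entries match the order traced in the text (frame 0 → frame 8 → frame 4 → frame 12), and every frame number from 0 to 14 appears exactly once. Each pass halves the spacing between transmitted frames, which is what allows the decoding side to raise its frame rate step by step.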
[0102]
In FIGS. 2C and 2D, "transmission order" and "reference frame number" are each drawn on four horizontal lines, but this is only for convenience of illustration; in terms of time, the sequence runs from left to right and from top to bottom. That is, FIG. 2C shows the order of transmission, not the frame number. The frame of frame number 0 is transmitted first. Next, the frame of frame number 8 shown in FIG. 2B is transmitted at the position marked 1 in FIG. 2C, then the frame of frame number 4 shown in FIG. 2B at the position marked 2 in FIG. 2C, and then the frame of frame number 12 shown in FIG. 2B at the position marked 3 in FIG. 2C. The remaining frames are transmitted in the same manner, in the order of frame numbers 10, ..., and finally the frame of frame number 13 shown in FIG. 2B is transmitted at the last position shown in FIG. 2C.
[0103]
Next, the method of selecting reference frames at encoding based on the frame transmission order of the present invention will be described. The frame types are shown schematically in FIG. 2A, each enclosed in a square. The first encoded frame in the GOV is an IVOP, as indicated by I, and the second and subsequent encoded frames are PVOPs, as indicated by P. The GOV is 0.5 seconds long and contains 15 encoded frames. It is also assumed that the encoded frames are transmitted in the predetermined frame transmission order of the present embodiment, as shown in the figure.
[0104]
The frame encoded first in the GOV is frame 0, which is the first frame of the GOV. This frame 0 is encoded as IVOP. Next, as shown by “transmission order” in FIG. 2D, it can be seen that the frame of frame number 8 (frame 8) is the frame to be encoded next.
[0105]
In the conventional method, if frame 8 is to be encoded, frame 8 is one of the input frames taken in at a fixed sampling interval. Since frame 8 must be encoded as a PVOP, it ordinarily uses as its reference frame the frame encoded immediately before it.
[0106]
In other words, in the conventional method, when the frame to be encoded is the frame 8 and the previous encoded frame is the frame 0, the sampling interval when performing the encoding is 8. . This means that when the subsequent frames are encoded, the sampling interval must be 8 and the frame rate in this case is 2 [fps].
[0107]
In the present embodiment, by contrast, the reference frame of any frame encoded before the frame currently being encoded can be used. As shown in FIG. 2, frame 8 is encoded with reference to frame 0. Frame 4, which is transmitted third, can refer to either frame 0 or frame 8; here, as illustrated in FIG. 2D, it refers to frame 0. Frame 12, which is transmitted fourth, can refer to frames 0, 8, and 4; here, as shown in FIG. 2D, it refers to frame 8, the frame closest to frame 12.
[0108]
As described above, in the present embodiment, as shown in FIG. 2D, frames are encoded according to the transmission order, and each frame can use the closest reference frame among the frame positions already encoded in the GOV.
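The selection described in paragraphs [0107] and [0108] can be sketched as follows. This is a minimal illustration rather than the control logic of the apparatus: the function name `choose_reference` is hypothetical, and breaking ties toward the lower frame number is an assumption adopted only because it agrees with frame 4 referring to frame 0 in the example.

```python
def choose_reference(frame, encoded):
    """Return the already-encoded frame number nearest to `frame`.

    Ties go to the lower frame number (an assumption, not stated in the text).
    """
    if not encoded:
        raise ValueError("no encoded frame is available as a reference")
    return min(encoded, key=lambda f: (abs(f - frame), f))


# First four frames of the transmission order given in the text.
order = [0, 8, 4, 12]
encoded = [order[0]]          # frame 0 is an IVOP and needs no reference
references = {}
for frame in order[1:]:
    references[frame] = choose_reference(frame, encoded)
    encoded.append(frame)

print(references)  # {8: 0, 4: 0, 12: 8}
```

The result matches the references named in the text: frame 8 uses frame 0, frame 4 uses frame 0, and frame 12 uses frame 8.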
[0109]
Furthermore, since the present invention encodes according to a predetermined frame transmission order, the fixed frame interval at encoding time described for the conventional method does not arise, and encoding proceeds while the frame interval is progressively shortened. With the predetermined frame transmission order and the reference frame selection method of the present invention, the frame rate on the decoding side can therefore be raised in step with the number of frames that can be decoded.
[0110]
When the image encoding transmission apparatus encodes using the frame transmission order determined as described above, so that the encoded frames can be sent over the transmission path in that order and decoded by the image reception decoding apparatus in the order in which they arrive, a new effect is obtained.
[0111]
In the example of the GOV configuration in FIG. 2, playback frames are initially decoded at an interval of 8 frames, which is about half of the number of frames M = 15 included in the GOV. The interval between decoded frames is then halved to 8/2 = 4 frames, and this continues until the interval finally becomes one frame, which is the original frame rate.
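The halving of the interval can be reproduced with the formulas B = [(A + 1) / 2] and D = B + 2 × B × n that appear in the claims. The sketch below is a hedged illustration: the function name is hypothetical, and replacing A with B after each pass and stopping after the B = 1 pass are assumptions, since this passage does not spell out those steps. It is checked only against the M = 15 example and raises an error for values of M where these assumptions fail to cover every frame exactly once.

```python
def transmission_order(m: int) -> list:
    """Frame numbers of an m-frame GOV in stride-halving transmission order."""
    order = [0]              # frame number 0 is always sent first
    a = m                    # variable A, initialised to M
    while a > 1:             # assumed stopping rule: finish after the B = 1 pass
        b = (a + 1) // 2     # B = [(A + 1) / 2], Gauss symbol = floor
        n = 0
        while b + 2 * b * n < m:          # D = B + 2 * B * n, accepted while D < M
            order.append(b + 2 * b * n)
            n += 1
        a = b                # assumed update: the next pass starts from A = B
    if sorted(order) != list(range(m)):
        raise ValueError(f"assumptions do not cover every frame once for M = {m}")
    return order


print(transmission_order(15))
# [0, 8, 4, 12, 2, 6, 10, 14, 1, 3, 5, 7, 9, 11, 13]
```

For M = 15 the strides are 8, 4, 2, and 1, so each pass fills in the frames midway between those already sent.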
[0112]
Here, suppose that the transmission bit rate of the transmission path drops sharply and only about half of the encoded frames included in the GOV can be transmitted. As described above, one GOV spans 0.5 seconds and contains M = 15 frames, and the decoding side is assumed to receive only 8 of the 15 frames in the GOV.
[0113]
In the conventional method, the frame rate remains 30 [fps], so the 8 frames that were received, that is, about 0.25 seconds of video, are displayed smoothly, but there are no frames to display for the remaining 0.25 seconds or so, and the moving image display stops for that period. This means that whatever happened in the latter 0.25 seconds of the video is lost entirely. If this occurs in sequential video distribution such as VOD (Video On Demand) or surveillance camera feeds, the likelihood of missing important information in the video increases, which has a serious negative effect on video quality.
[0114]
In the present embodiment, by contrast, even if only the first 8 frames in the GOV can be received, encoding based on the frame transmission order of this embodiment allows the frame rate to be set automatically at decoding time, and the encoded frames in the GOV can be reproduced and displayed at intervals that are as equal as possible within each GOV. In this embodiment, the video becomes less smooth as the frame rate falls, and content that occurred between displayed frames may be lost, but complete loss of the video content can be avoided.
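The difference between the two cases can be made concrete with a small calculation. The sketch below is illustrative only: `ORDER` is an assumed full transmission order for M = 15 (the text names only frames 0, 8, 4, 12, ..., 13), and `largest_gap` is a hypothetical helper that measures the longest run, in frame periods, between one displayed frame and the next, counting the start of the following GOV as the end point.

```python
GOV_FRAMES = 15
# Assumed full transmission order for M = 15; the text gives 0, 8, 4, 12, ..., 13.
ORDER = [0, 8, 4, 12, 2, 6, 10, 14, 1, 3, 5, 7, 9, 11, 13]


def largest_gap(received, gov_frames=GOV_FRAMES):
    """Longest distance between consecutive displayable frames in one GOV."""
    frames = sorted(received) + [gov_frames]   # gov_frames marks the next GOV
    return max(b - a for a, b in zip(frames, frames[1:]))


conventional = list(range(8))   # frames 0..7 arrive, the rest are lost
proposed = ORDER[:8]            # the first 8 frames in transmission order

print(sorted(proposed))          # [0, 2, 4, 6, 8, 10, 12, 14]
print(largest_gap(conventional)) # 8
print(largest_gap(proposed))     # 2
```

Under these assumptions the conventional order leaves a hole of 8 frame periods at the end of the GOV, whereas the proposed order never leaves more than 2 frame periods between displayed frames.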
[0115]
This effect is even more pronounced when the encoded frames are transmitted in an environment in which the transmission bit rate of the transmission path fluctuates frequently. In the conventional method, when the transmission rate fluctuates frequently, the video is reproduced in small fragments, and it becomes extremely difficult to grasp the content of the video as a whole. In the present embodiment, however, the frame rate changes dynamically according to the frames received, so the content of the entire video is reproduced without noticeable disruption, and extreme fluctuations in the frame rate are also avoided. The technique is therefore very effective in environments where the transmission rate fluctuates frequently.
[0116]
A further effect also arises. Consider a case where the encoded frames sent over the transmission path form a bitstream that was encoded in advance. For a bitstream encoded by the conventional method, the frame rate at decoding time is fixed at encoding time. Therefore, if the frame rate is to be controlled as needed, some transcoding means is required.
[0117]
According to the present embodiment, by contrast, a bitstream that has been generated once can be decoded in units of GOV and the decoding stopped at any position, and the GOV can then be reproduced and displayed at a frame rate corresponding to the number of frames decoded up to that point. Furthermore, for the same reason, even if the decoding capability of the image receiving / decoding device is low, playback and display can be performed at an appropriate frame rate according to the number of frames that can be decoded.
[0118]
Next, the operation of the image coding / transmission apparatus according to the present invention will be described with reference to the block diagram of one embodiment of the apparatus in FIG. 1 and the flowcharts. FIG. 4 is a flowchart showing the operation of the encoder 101 in FIG. 1.
[0119]
Hereinafter, the operation of the encoder 101 in FIG. 1 will be described. When image input is started by the input unit 103, the encoder 101 stores the input frame acquired from the input unit 103 in the buffer buffer 104 (step S201 in FIG. 4). At this point, it is assumed that the number of frames necessary for configuring the GOV has already been stored in the buffer buffer 104.
[0120]
In step S202, the input / output frame management unit 107 determines whether the buffer buffer 104 stores a sufficient number of frames necessary for configuring the GOV. If enough frames are stored in the buffer buffer 104, the process proceeds to step S203. If there are not enough frames stored in the buffer buffer 104, the process proceeds to the terminal (3) and the processing of the encoding unit 101 is terminated.
[0121]
In step S203, the input / output frame management means 107 connects the switch SW1. When the switch SW1 is connected by the input / output frame management means 107, the buffer buffer 104 can store the number of frames necessary for configuring the GOV in the first multiple-frame storage memory 106. The first multiple-frame storage memory 106 stores the input frames input from the buffer buffer 104 for the number of frames necessary for configuring the GOV (step S204 in FIG. 4).
[0122]
When the number of frames necessary for configuring the GOV is sufficiently stored in the first plural-frame storage memory 106, the input / output frame management means 107 disconnects the switches SW1 and SW5 (step S205 in FIG. 4). Subsequently, the input / output frame management unit 107 notifies the reference frame control unit 110 of an encoding start request (step S206 in FIG. 4). Thereafter, the reference frame control means 110 connects the switches SW2 and SW3 to a position where the first frame of the GOV can be acquired (step S207 in FIG. 4), and further disconnects the switch SW4 (step S208 in FIG. 4).
[0123]
Subsequently, the reference frame control unit 110 notifies the encoding unit 111 of an encoding start request (step S209 in FIG. 4). Then, the encoding unit 111 acquires an input frame from the first multi-frame storage memory 106 via the switch SW2 (step S210 in FIG. 4), and encodes the input frame (step S211 in FIG. 4). Then, the encoding unit 111 outputs the encoded frame to the buffer buffer 109 (step S212 in FIG. 4). Further, the local decoding unit 116 acquires an encoded frame from the encoding unit 111 (step S213 in FIG. 4).
[0124]
The local decoding unit 116 decodes the acquired encoded frame to generate a reference frame (step S214 in FIG. 4), and stores the reference frame in the second multiple-frame storage memory 113 via the switch SW3 (step S215 in FIG. 4). Next, the reference frame control unit 110 determines whether or not all the encoded frames in the GOV have been prepared from the current encoded frame position based on a predetermined frame transmission order (step S216 in FIG. 4). If all the encoded frames are ready, the process proceeds to step S226 corresponding to terminal (1). If not all the encoded frames are ready, the process proceeds to step S217.
[0125]
When not all the encoded frames are ready, the reference frame control unit 110 switches the switch SW2, the switch SW3, and the switch SW4 to positions where the next frame of the GOV can be acquired based on the predetermined frame transmission order. Thereafter, the reference frame control unit 110 notifies the encoding unit 111 of an encoding start request (step S218 in FIG. 4).
[0126]
Based on the above encoding start request, the encoding unit 111 acquires an input frame from the first multiple-frame storage memory 106 via the switch SW2, and refers to the second multiple-frame storage memory 113 via the switch SW4. A frame is acquired (step S219 in FIG. 4). Thereafter, the encoding unit 111 performs encoding using the acquired input frame and reference frame (step S220 in FIG. 4), and outputs the obtained encoded frame to the buffer buffer 109 (step S221 in FIG. 4).
[0127]
Next, the local decoding unit 116 acquires an encoded frame from the encoding unit 111 and also acquires a reference frame from the second multiple-frame storage memory 113 via the switch SW4 (step S222 in FIG. 4). Subsequently, the local decoding unit 116 generates a reference frame by decoding the acquired encoded frame using the acquired reference frame (step S223 in FIG. 4), and stores the generated reference frame in the second multiple-frame storage memory 113 via the switch SW3 (step S224 in FIG. 4).
[0128]
Subsequently, the reference frame control means 110 determines whether or not all the encoded frames in the GOV are prepared from the current encoded frame position based on a predetermined frame transmission order (step S225 in FIG. 4). If all the encoded frames are ready, the process proceeds to step S226. If all the encoded frames are not complete, the process proceeds to step S217, and the processes of steps S217 to S224 described above are performed again.
[0129]
If the reference frame control unit 110 determines in step S216 or step S225 that all the encoded frames in the GOV have been prepared, it notifies the input / output frame management unit 107 and the encoding unit 111 of a GOV encoding stop request (step S226 in FIG. 4) and then disconnects the switch SW2 (step S227 in FIG. 4). The input / output frame management means 107, for its part, connects the switch SW5 (step S228 in FIG. 4). As a result, all the encoded frames in the GOV are transferred from the buffer buffer 109 via the switch SW5 and stored in the third plural-frame storage memory 117 in the encoded transmission unit 102 (step S229 in FIG. 4). Thereafter, the process proceeds to step S202 corresponding to terminal (2).
[0130]
The encoding unit of the image coding / transmission apparatus of the present embodiment operates through the processing described above, which makes it possible to solve the various problems of the conventional method.
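The per-GOV flow of FIG. 4 can be summarised in a short sketch. It is a simplified model under stated assumptions, not the apparatus itself: no pixels are encoded, `reference_store` merely stands in for the second multiple-frame storage memory 113, and the names `encode_gov` and `nearest` are hypothetical. The nearest-frame rule with ties toward the lower frame number is likewise an assumption consistent with the FIG. 2 example.

```python
def nearest(frame, stored):
    """Assumed selection rule: closest stored frame, ties to the lower number."""
    return min(stored, key=lambda f: (abs(f - frame), f))


def encode_gov(order, choose_reference=nearest):
    """Walk one GOV in transmission order and record (frame, type, reference)."""
    reference_store = []   # stands in for the second multiple-frame storage memory
    encoded = []           # stands in for the output buffer of encoded frames
    for position, frame in enumerate(order):
        if position == 0:
            vop_type, ref = "I", None     # first frame: IVOP, no reference used
        else:
            vop_type, ref = "P", choose_reference(frame, reference_store)
        encoded.append((frame, vop_type, ref))
        reference_store.append(frame)     # locally decoded frame becomes available
    return encoded


print(encode_gov([0, 8, 4, 12]))
# [(0, 'I', None), (8, 'P', 0), (4, 'P', 0), (12, 'P', 8)]
```

The loop mirrors the two branches of the flowchart: the first frame is encoded without a reference, and every later frame is encoded against a frame already held in the reference store.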
[0131]
Next, the operation of the coding transmission unit 102 in FIG. 1 will be described with reference to the flowcharts. First, it is determined whether or not the GOV can be acquired from the buffer buffer 109 in the encoding unit 101 (step S301 in FIG. 5). If the GOV can be acquired, the third multi-frame storage memory 117 acquires the GOV from the buffer buffer 109 (step S302 in FIG. 5). Subsequently, the encoded bitstream analyzing unit 118 acquires the header information in the GOV from the third multi-frame storage memory 117 (step S303 in FIG. 5), analyzes the header information (step S304 in FIG. 5), and sends an encoding transmission start request to the frame transmission order control means 119 (step S305 in FIG. 5).
[0132]
As a result, the frame transmission order control means 119 switches the switch SW6 based on the predetermined frame transmission order (step S306 in FIG. 5), and the encoded frames stored in the third multi-frame storage memory 117 are supplied to the encoded bitstream analyzing means 118 in the predetermined frame transmission order (step S307 in FIG. 5). The encoded bitstream analyzing unit 118 adds or corrects the necessary header information for each input encoded frame (step S308 in FIG. 5), and outputs the encoded frame together with the header information to the transmission packet transmitting unit 121 (step S309 in FIG. 5). The operations in steps S305 to S309 are repeated until all the encoded frames in the GOV are processed (step S310 in FIG. 5).
[0133]
Next, the transmission packet sending unit 121 determines whether or not an encoded frame can be acquired from the encoded bitstream analyzing unit 118 (step S401 in FIG. 6), and acquires the encoded frame if it can be acquired (step S402 in FIG. 6). Then, the transmission packet sending unit 121 packetizes the acquired encoded frame (step S403 in FIG. 6), and outputs the packetized data to the network or storage unit 122, which is the transmission path (step S404 in FIG. 6). Finally, the transmission packet sending unit 121 ends the process when all packetized data has been output to the network or storage unit 122 (steps S405 and S404 in FIG. 6).
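The packetizing step can be illustrated with a toy example. The patent does not specify a packet format, so everything here is an assumption made for illustration: the five-byte header (GOV id, frame number, packet index, packet count), the function names, and the tiny payload size. The reassembly function returns `None` when a packet is missing, which corresponds to an encoded frame that cannot be reconstructed on the receiving side.

```python
import struct

# Hypothetical header: GOV id (2 bytes), frame number, packet index, packet count.
HEADER = struct.Struct("!HBBB")


def packetize(gov_id, frame_no, payload, max_payload=4):
    """Split one encoded frame into packets carrying the hypothetical header."""
    chunks = [payload[i:i + max_payload]
              for i in range(0, len(payload), max_payload)] or [b""]
    return [HEADER.pack(gov_id, frame_no, index, len(chunks)) + chunk
            for index, chunk in enumerate(chunks)]


def reassemble(packets):
    """Rebuild the encoded frame, or return None if any packet is missing."""
    if not packets:
        return None
    parts = {}
    count = 0
    for packet in packets:
        _, _, index, count = HEADER.unpack(packet[:HEADER.size])
        parts[index] = packet[HEADER.size:]
    if len(parts) != count:
        return None
    return b"".join(parts[i] for i in range(count))


packets = packetize(gov_id=1, frame_no=8, payload=b"encoded-frame-8")
print(len(packets))                 # 4
print(reassemble(packets))          # b'encoded-frame-8'
print(reassemble(packets[:-1]))     # None
```

Because frames are sent in the predetermined transmission order, losing the tail of a GOV under this kind of scheme removes only the frames sent last, which are the ones that refine the frame rate.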
[0134]
Next, an example of an image receiving / decoding device that receives and decodes an encoded bitstream transmitted from the image encoding / transmission device of the present invention via a transmission path will be described. FIG. 7 is a block diagram showing an example of the image receiving / decoding apparatus. As shown in the figure, the image receiving / decoding apparatus comprises a decoding reception unit 201 that receives at least one packet from the image encoding / transmission apparatus shown in FIG. 1 via the network or storage means 203 (122) and reconstructs an encoded GOV from the at least one packet, a decoding unit 202 that acquires the reconstructed GOV and decodes it using at least one reference frame, a display frame memory 219 that stores one frame of the image signal from the decoding unit 202, and a display device 220 that displays the image signal from the display frame memory 219.
[0135]
In FIG. 7, the decoding reception unit 201, the switch SW12, and the buffer buffer 211 are extensions of the demultiplexer 402 shown in FIG. 10, the second multi-frame storage memory 213 is an extension of the reference prediction memory 406 shown in FIG. 10, and the decoding unit 216, reference frame control means 212, switches SW13 to SW15, and input / output frame management means 209 are extensions of the other components in FIG. 10.
[0136]
Next, the configuration and operation of the decoding reception unit 201 will be described. The decoding reception unit 201 includes a transmission packet receiving unit 204, an encoded bitstream analyzing unit 205, a frame storage order control unit 206, a switch SW11, and a first multiple frame storage memory 208. First, at least one packet input from the network or storage unit 203 is received by the transmission packet receiving unit 204, which reconstructs the encoded frame and supplies it to the encoded bitstream analyzing unit 205.
[0137]
The encoded bitstream analyzing unit 205 analyzes the input header information and requests the frame storage order control unit 206 to perform switching control of the switch SW11 in order to reconstruct the GOV. Thereby, the frame storage order control means 206 switches the switch SW11 based on a predetermined frame storage order, and supplies the encoded frame from the encoded bitstream analysis means 205 to the first plural-frame storage memory 208 for storage. At the same time, the header information is also stored in the first plural-frame storage memory 208.
[0138]
The first multi-frame storage memory 208 is a memory for storing the encoded frames reconstructed by the transmission packet receiving unit 204 and the encoded bit stream analyzing unit 205, and has a memory capacity capable of storing the encoded frames of at least 1 GOV. This memory capacity is secured by calculating it in advance from the maximum bit rate of the encoding unit and the maximum bit rate of the transmission path.
[0139]
Next, the configuration and operation of the decoding unit 202 will be described. The decoding unit 202 includes an input / output frame management unit 209, switches SW12 to SW15, a buffer buffer 211, a reference frame control unit 212, a second multi-frame storage memory 213, a decoding unit 216, and a buffer buffer 218. The switches SW13 and SW15 are alike in that each, under the control of the reference frame control means 212, selects the image signal of one frame from the image signals of the plurality of frames stored in the second multi-frame storage memory 213. They differ in that the switch SW13 outputs the image signal of the selected frame to the decoding unit 216, whereas the switch SW15 outputs it to the buffer buffer 218.
[0140]
Further, in order to control decoding in units of GOV, the input / output frame management means 209 manages the storage of encoded frames from the first plural-frame storage memory 208 into the buffer buffer 211 by switching the switch SW12. It also manages the timing at which decoding starts in units of GOV by notifying the reference frame control unit 212 of a request to start decoding in units of GOV.
[0141]
As a result, when all the encoded frames in a GOV have been stored in the first multiple-frame storage memory 208, the switch SW12 in the decoding unit 202 is connected under the control of the input / output frame management means 209, and the encoded frames in the GOV stored in the first multiple-frame storage memory 208 are sequentially and temporarily stored in the buffer buffer 211 in units of GOV.
[0142]
Subsequently, the input / output frame management unit 209 notifies the reference frame control unit 212 of a decoding start request. Then, the reference frame control means 212 connects the switch SW13 and the switch SW14 to a position where the first decoded frame of the GOV can be acquired, and disconnects the switch SW12.
[0143]
Subsequently, the reference frame control unit 212 notifies the decoding unit 216 of a decoding start request. Here, the decoding unit 216 must include at least, for example, the entropy decoder 403, the inverse quantizer 404, the inverse orthogonal transformer 405, the adder 408, and the motion compensation predictor 407 of the MPEG decoding apparatus illustrated in FIG. 10. The reference prediction memory 406 shown in FIG. 10 corresponds to the second plural-frame storage memory 213 in FIG. 7.
[0144]
When the decoding unit 216 obtains an encoded frame from the buffer buffer 211, it performs entropy decoding with its internal entropy decoder, then inverse quantization with the inverse quantizer, and then an inverse orthogonal transform. From the result of the inverse orthogonal transform and the output of the motion compensation predictor, which receives the image signal of the reference frame from the second multi-frame storage memory 213 via the switch SW13, it obtains the decoded frame of the current encoded frame.
[0145]
The reference frame control means 212 controls the switch SW14 so that the reference frame decoded by the decoding unit 216 is stored, via the switch SW14, at the correct reference frame position in the second multiple-frame storage memory 213, in preparation for decoding the next encoded frame.
[0146]
The second multi-frame storage memory 213 has frame memories for at least the number of frames necessary for configuring the GOV, so that each reference frame generated by the decoding unit 216 can be stored at the corresponding frame position in the GOV. For example, when the frame rate of the input image is 30 [fps], 1 GOV is 0.5 seconds, and the number of encoded frames stored in 1 GOV is 15, the second multi-frame storage memory 213 is given a memory capacity capable of storing at least 15 reference frames.
[0147]
Next, the reference frame control means 212 determines, from the current decoded frame position and based on the predetermined frame transmission order, whether all the decoded frames in the GOV are present in the second multi-frame storage memory 213. If not all the decoded frames are present, it switches the switch SW14 to a position where the next decoded frame of the GOV can be acquired, based on the predetermined frame transmission order.
[0148]
When, by repeating the above operation, the reference frame control means 212 determines that all the encoded frames in the GOV have been decoded by the decoding unit 216 and are present in the second multi-frame storage memory 213, it issues a GOV decoding stop request and disconnects the switch SW14. The input / output frame management means 209, for its part, connects the switch SW15.
[0149]
Thus, all the image signals (decoded reference frames) in the GOV are supplied from the second plural-frame storage memory 213 to the buffer buffer 218 via the switch SW15 and temporarily stored there. They are then supplied to and temporarily stored in the display frame memory 219, after which the image is displayed on the display device 220.
[0150]
Note that the present invention is not limited to the above embodiment. For example, in FIG. 1, when one GOV of encoded frames generated by the encoding unit 111 has been stored in the buffer buffer 109, the encoded frames may be stored in the third plural-frame storage memory 117 in the original frame number order 0, 1, ..., 14, and the frame transmission order control unit 119 may control the switch SW6 so that the encoded frame with the corresponding frame number is read from the third multi-frame storage memory 117 in the order in which the encoding unit 111 encoded the frames (the transmission order shown in FIG. 2(C)). In this case, however, transmission frame order determination means that determines the transmission frame order according to the flowchart shown in FIG. 3 must be provided in advance in, for example, the encoding unit 101, and both the reference frame control means 110 and the frame sending order control means 119 must be able to refer to the transmission frame order so determined.
[0151]
The first plural-frame storage memory 106 may have a memory capacity capable of storing one image frame.
[0152]
In addition, because the encoding method and transmission order of the encoded frames in the GOV are taken into account, even when the transmission bit rate of the transmission path falls and a generated GOV cannot be transmitted within a given time, the number of encoded frames transmitted can be limited by discarding the remainder of the GOV that could not be transmitted. On the image reception decoding device side, the frame rate then changes automatically according to the number of encoded frames included in the received GOV, and the frames can be reproduced and displayed at intervals that are as equal as possible. The encoding bit rate may thus be controlled without constantly monitoring the transmission bit rate of the transmission path.
[0153]
Furthermore, with the same discarding of the untransmitted remainder of a GOV and the same automatic change of frame rate on the image reception decoding device side, it is possible to reduce the increase in transmission delay caused by the time lag required to constantly monitor the transmission bit rate of the transmission path and detect its changes, and to reduce the resulting influence on GOV decoding.
[0154]
[Effect of the invention]
As described above, according to the present invention, at the time of encoding by the encoding unit, an arbitrary frame among the frames encoded before the current frame to be encoded and stored in the second frame storage memory is used as a reference frame, so the reference frame that is closest in time can be used. Therefore, even over a transmission line in which the transmission rate fluctuates drastically, the encoded bit stream can be sent out in a predetermined sending frame order such that packet loss caused by a change in transmission rate does not make the information in the GOV entirely impossible to reconstruct.
[0155]
Further, according to the present invention, the encoded bit stream can be transmitted to the image receiving / decoding apparatus in a predetermined transmission frame order such that the frame rate on the decoding side changes automatically according to the number of frames in the GOV accumulated up to the decoding time, using the GOV information transmitted up to that time, and such that frames can be reproduced and displayed at intervals that are as equal as possible. Consequently, even when the encoded bit stream is sent over a transmission path whose transmission rate fluctuates significantly, the content of the image is not completely lost, and the image receiving / decoding apparatus can grasp the content of the entire image.
[0156]
In addition, according to the present invention, since encoded frames are transmitted in a predetermined frame transmission order so that frames can be reproduced and displayed at intervals that are as equal as possible, the encoding bit rate can be controlled without constantly monitoring the transmission bit rate of the transmission path. Further, according to the present invention, the increase in transmission delay caused by the time lag required to constantly monitor the transmission bit rate of the transmission line and detect its changes can be reduced. As described above, according to the present invention, it is possible to realize image transmission with higher quality than before.
[Brief description of the drawings]
FIG. 1 is a block diagram of an embodiment of an image encoding and transmitting apparatus of the present invention.
FIG. 2 is a diagram for explaining an example of a frame transmission order and a method of using a reference frame at the time of encoding by the image encoding / transmission apparatus of the present invention.
FIG. 3 is a flowchart showing an embodiment of a process for obtaining a frame transmission order according to the present invention.
FIG. 4 is a flowchart for explaining the operation of an encoding unit in the image encoding transmission apparatus of the present invention.
FIG. 5 is a flowchart (part 1) for explaining the operation of the encoding transmission unit in the image encoding transmission apparatus of the present invention.
FIG. 6 is a flowchart (part 2) for explaining the operation of the encoding transmission unit in the image encoding transmission device of the present invention.
FIG. 7 is a block diagram of an example of an image receiving / decoding apparatus that decodes an encoded frame output by an image encoding / transmission apparatus of the present invention.
FIG. 8 is a block diagram of an example of a conventional encoding apparatus based on MPEG technology.
FIG. 9 is a diagram representing a GOV header in MPEG-4.
FIG. 10 is a block diagram of an example of a conventional decoding device based on MPEG technology.
FIG. 11 is a block diagram of an example of a conventional image encoding / transmission apparatus.
FIG. 12 is a block diagram of an example of a conventional image receiving / decoding apparatus.
[Explanation of symbols]
101 Coding unit
102 Encoded transmission unit
103 Input means (input image)
104, 109, 211, 218 Buffer buffer
106, 208 First multiple frame storage memory
107, 209 Input / output frame management means
110, 212 Reference frame control means
111 Encoding part
113, 213 Second memory for storing a plurality of frames
116 Local decoding part
117 Third multi-frame storage memory
118, 205 Coding bitstream analysis means
119 Frame transmission order control means
121 Transmission packet sending means
122, 203 Network or storage means
201 Decoding reception unit
202 Decoding unit
204 Transmission packet receiving means
206 Frame storage order control means
209 I / O frame management means
216 Decoding part
219 Display frame memory
220 Display device
SW1 to SW6, SW11 to SW14 switch

Claims

An image encoding transmission apparatus that encodes an input image signal using a reference frame and transmits the obtained encoded frame according to a predetermined frame transmission order, the apparatus comprising:
A first frame storage memory for storing the input image signal for at least one frame;
A second frame storage memory for storing a predetermined number of reference frames used at the time of encoding;
First switch means provided on the output side of the first frame storage memory;
Second switch means provided on the input side and output side of the second frame storage memory;
Reference frame control means for controlling the first and second switch means in accordance with the predetermined frame transmission order, thereby controlling the transfer of the input image signal and the reference frames stored in the first and second frame storage memories;
An encoding unit that encodes the input image signal having the frame number selected by the first switch means from among the input image signals stored in the first frame storage memory, using one reference frame selected by the second switch means from among the reference frames stored in the second frame storage memory, to obtain an encoded frame;
A local decoding unit that decodes the encoded frame output from the encoding unit, using the one reference frame, among the reference frames stored in the second frame storage memory, that was used by the encoding unit at the time of encoding, to generate a new reference frame, and stores the new reference frame in the second frame storage memory via the second switch means;
Input / output frame management means for monitoring and managing the storage state of the first frame storage memory, monitoring and managing the output of encoded frames from the encoding unit, and notifying the reference frame control means of these states;
A third frame storage memory for storing a group of the encoded frames generated by the encoding unit; and
Output means for packetizing each frame included in the group of encoded frames stored in the third frame storage memory and transmitting the packets in accordance with the predetermined frame transmission order.

The image encoding transmission apparatus according to claim 1, wherein the reference frame control means controls the first and second switch means using, as the predetermined frame transmission order, the transmission frame numbers determined by calculation means, in the order in which they are determined, the calculation means comprising:
initializing means that, where M is the number of encoded frames included in the group of encoded frames and the frame numbers of the input image signal stored in the first frame storage memory are the natural numbers from 0 to M−1, takes frame number 0 as the first transmission frame number and initializes a variable A to M and a count value C to 1;
first calculating means that, after the initialization by the initializing means, calculates a value B by the formula B = [(A + 1) / 2] (where [ ] is the Gauss symbol) and sets a variable n to 0;
second calculating means that calculates D by the formula D = B + 2 × B × n (n is a positive integer), using the value of B obtained by the first calculating means and the value of the variable n;
first determination means that determines whether the value of D calculated by the second calculating means is less than M;
transmission frame number determining means that, when the first determination means determines that D < M, determines the value of D at that time as a transmission frame number, increments the variable n and the count value C by 1 each, and causes the second calculating means to calculate D again;
second determination means that, when the first determination means determines that D is equal to or greater than M, determines whether the count value C is greater than M; and
setting means that, when the second determination means determines that the count value C is less than or equal to M, sets the value of A to the value of B at that time and then causes the first calculating means to calculate the value of B,
wherein the calculation means performs the calculation until the second determination means determines that the count value C is greater than M.
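
The frame-number calculation recited in the claim above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions, not the patented implementation: the function name `transmission_order` is hypothetical, the Gauss symbol is taken to mean rounding down to an integer, and M is assumed to be a power of two. In a hand trace for M = 8, testing C > M with C starting at 1 appears to allow one extra pass that repeats the odd frame numbers, so the sketch stops as soon as C reaches M. That stop condition and the power-of-two restriction are assumptions of this sketch, not statements of the claim.

```python
def transmission_order(m: int) -> list[int]:
    """Return frame numbers 0..m-1 in the order produced by the claimed steps.

    Assumptions of this sketch (not taken from the claim text):
    m is a power of two, and the loop ends once the count C reaches M.
    """
    if m < 1 or m & (m - 1):
        raise ValueError("this sketch assumes M is a power of two")

    order = [0]   # frame number 0 is the first transmission frame number
    a, c = m, 1   # initialization: A = M, C = 1

    while c < m:                          # assumed stop condition (claim text tests C > M)
        b = (a + 1) // 2                  # B = [(A + 1) / 2], Gauss symbol read as floor
        n = 0                             # n is reset whenever B is recalculated
        while (d := b + 2 * b * n) < m:   # D = B + 2 * B * n, compared with M
            order.append(d)               # D becomes the next transmission frame number
            n += 1
            c += 1
        a = b                             # setting step: A takes the current value of B
    return order


if __name__ == "__main__":
    print(transmission_order(8))  # prints [0, 4, 2, 6, 1, 3, 5, 7]
```

For M = 8 the sketch yields the order 0, 4, 2, 6, 1, 3, 5, 7: the middle frame of the group follows frame 0, then the quarter points, then the remaining odd-numbered frames.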