JP3700801B2 - Image coding apparatus and image coding method

Info

Publication number: JP3700801B2
Application number: JP29534296A
Authority: JP
Inventors: 信弥伊木; 元樹加藤; 聡三橋; 裕司安藤
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 1996-11-07
Filing date: 1996-11-07
Publication date: 2005-09-28
Anticipated expiration: 2016-11-07
Also published as: JPH10145792A

Description

【０００１】
【発明の属する技術分野】
本発明は、画像符号化装置および画像符号化方法に関し、特に、例えば、動画像を、光磁気ディスクや磁気テープなどの記録媒体に記録したり、テレビ会議システムや、テレビ電話システム、放送用機器などにおいて、動画像を、伝送路を介して、送信側から受信側に伝送する場合などに用いて好適な画像符号化装置および画像符号化方法に関する。
【０００２】
【従来の技術】
例えば、動画像をディジタル化して記録したり、伝送する場合においては、そのデータ量が膨大であることから、従来より、画像データを圧縮符号化することが行われている。動画像の代表的な符号化方式としては、動き補償予測符号化などがある。
【０００３】
動き補償予測符号化は、画像の時間軸方向の相関を利用する符号化方法で、図１３に示すように、参照する画像（参照画像）（参照フレーム）に対する、符号化対象の画像（符号化対象画像）（現フレーム）の動きベクトルを検出し、その動きベクトルにしたがって、既に符号化されて復号化された参照画像を動き補償することにより、予測画像を生成する。そして、この予測画像に対する、符号化対象画像の予測残差を求め、この予測残差と動きベクトルを符号化することにより、動画像の情報量が圧縮される。
【０００４】
動き補償予測符号化の具体的なものとしては、ＭＰＥＧ（Moving Picture Experts Group）符号化がある。これは、ＩＳＯ（国際標準化機構）とＩＥＣ（国際電気標準会議）のＪＴＣ（Joint Technical Committee）１のＳＣ（Sub Committee）９のＷＧ（Working Group）１１においてまとめられた動画像符号化方式の通称である。
【０００５】
ＭＰＥＧでは、１フレームまたは１フィールドが、１６ライン×１６画素で構成されるマクロブロックに分割され、このマクロブロック単位で、動き補償予測符号化が行われる。
【０００６】
ここで、動き補償予測符号化には、大別して、イントラ符号化と、インター符号化（非イントラ符号化）の２つの符号化方式がある。イントラ符号化では、符号化対象のマクロブロックに関して、自身の情報がそのまま符号化され、インター符号化では、他の時刻のフレーム（またはフィールド）を参照画像として、その参照画像から生成される予測画像と、自身の情報との差分が符号化される。
【０００７】
ＭＰＥＧでは、各フレームが、Ｉピクチャ（Intra coded picture）、Ｐピクチャ（Predictive coded picture）、またはＢピクチャ（Bidirectionally predictive picture）のうちのいずれかとして符号化される。また、ＭＰＥＧでは、ＧＯＰ（Group Of Picture）単位で処理が行われる。
【０００８】
即ち、ＭＰＥＧにおいては、ＧＯＰは、例えば、図１４に示すように、１７フレームで構成される。そして、いま、このＧＯＰを構成するフレームを、その先頭から、Ｆ１，Ｆ２，・・・，Ｆ１７とするとき、例えば、同図に示すように、フレームＦ１はＩピクチャとして、フレームＦ２はＢピクチャとして、フレームＦ３はＰピクチャとして処理される。その後のフレームＦ４乃至Ｆ１７は、交互に、ＢピクチャまたはＰピクチャとして処理される。
【０００９】
Ｉピクチャはイントラ符号化されるが、ＰおよびＢピクチャは、基本的に、インター符号化される。即ち、Ｐピクチャは、図１４（Ａ）に矢印で示すように、基本的には、その直前のＩまたはＰピクチャを参照画像として用いて、インター符号化される。Ｂピクチャは、図１４（Ｂ）に矢印で示すように、基本的には、その直前のＩまたはＰピクチャと、その直後のＰピクチャとの両方、あるいは、そのいずれか一方を参照画像として用いて、インター符号化される。
【００１０】
より具体的には、図１５に示すように、まず、フレームＦ１がＩピクチャとして処理される。即ち、そのすべてのマクロブロックはイントラ符号化され（ＳＰ１）、伝送データＦ１Ｘとして、伝送路を介して伝送される。
【００１１】
次に、時間的に後行する画像（未来の画像）を参照画像とする可能性のあるＢピクチャであるフレームＦ２をスキップして、ＰピクチャであるフレームＦ３が先に処理される。フレームＦ３については、その直前のＩピクチャであるフレームＦ１を参照画像として、その参照画像から生成される予測画像に対する予測残差が求められ（順方向予測符号化され）（ＳＰ３）、これが、フレームＦ１に対する動きベクトルｘ３とともに、伝送データＦ３Ｘとして伝送される。あるいは、また、フレームＦ３は、フレームＦ１と同様にイントラ符号化され（ＳＰ１）、伝送データＦ３Ｘとして伝送される。Ｐピクチャを、イントラ符号化するか、または順方向予測符号化するかは、マクロブロック単位で切り換えることができる。
【００１２】
フレームＦ３の符号化後は、ＢピクチャであるフレームＦ２が処理される。Ｂピクチャは、イントラ符号化、順方向予測符号化、逆方向予測符号化、または双方向予測符号化される。
【００１３】
即ち、イントラ符号化では、フレームＦ２は、フレームＦ１と同様に、そのデータがそのまま伝送データＦ２Ｘとして伝送される（ＳＰ１）。
【００１４】
順方向予測符号化では、フレームＦ２は、その直前の（時間的に先行する）ＩまたはＰピクチャであるフレームＦ１を参照画像として、その参照画像から生成される予測画像に対する予測残差が求められ（順方向予測符号化され）（ＳＰ３）、これが、フレームＦ１に対する動きベクトルｘ１とともに、伝送データＦ２Ｘとして伝送される。
【００１５】
逆方向予測符号化では、フレームＦ２は、その直後の（時間的に後行する）ＩまたはＰピクチャであるフレームＦ３を参照画像として、その参照画像から生成される予測画像に対する予測残差が求められ（逆方向予測符号化され）（ＳＰ２）、これが、フレームＦ３に対する動きベクトルｘ２とともに、伝送データＦ２Ｘとして伝送される。
【００１６】
双方向予測符号化では、フレームＦ２は、フレームＦ１とＦ３を参照画像として、その参照画像から生成される予測画像の平均値などに対する予測残差が求められ（双方向予測符号化され）（ＳＰ４）、これが、フレームＦ１とＦ３に対する動きベクトルｘ１とｘ２とともに、伝送データＦ２Ｘとして伝送される。
【００１７】
なお、Ｂピクチャを、イントラ符号化、順方向予測符号化、逆方向予測符号化、または双方向予測符号化のうちのいずれで符号化するかも、Ｐピクチャと同様に、マクロブロック単位で切り換えることができる。
【００１８】
また、イントラ符号化に対して、順方向予測符号化、逆方向予測符号化、および双方向予測符号化が、インター符号化（非イントラ符号化）と呼ばれる。
【００１９】
ここで、以下、適宜、時間的に先行または後行する参照画像を、過去参照画像または未来参照画像という。
【００２０】
【発明が解決しようとする課題】
画像符号化装置には、Ｂピクチャのマクロブロックを符号化させる際に、イントラ符号化、順方向予測符号化、逆方向予測符号化、または両方向予測符号化のうちの、最も符号化効率の良い予測モードを選択させるのが望ましい。
【００２１】
そこで、Ｂピクチャを、上述の４つの予測モードそれぞれで符号化し、その結果得られるデータ量の最も少ないものを選択する方法がある。
【００２２】
しかしながら、この方法では、４つの予測モードそれぞれで符号化する必要があるため、処理に時間を要し、あるいは、装置規模が大きくなる。
【００２３】
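The applicant has therefore previously proposed the following method. A forward motion vector (the motion vector of the picture being coded relative to the past reference picture) and a backward motion vector (its motion vector relative to the future reference picture) are detected by motion estimation (ME). A predicted picture is obtained by motion-compensating the past or future reference picture according to the forward or backward motion vector, respectively. The prediction mode of a B picture is then determined from the prediction residual of the picture being coded relative to each predicted picture (the ME error, hereinafter also called the motion vector estimation residual). Strictly speaking, the method selects one of the three kinds of inter coding: forward predictive coding, backward predictive coding, and bidirectional predictive coding.
【００２４】
In this method (hereinafter the first method), the sum of the absolute differences between the pixel values of the macroblock being coded and those of the predicted macroblock obtained by motion-compensating a reference picture is first computed as the motion vector estimation residual.
【００２５】
Letting Ef and Eb denote the motion vector estimation residuals relative to the past and future reference pictures, respectively, the inter coding to be used is determined, for example, as shown in FIG. 16.
【００２６】
That is, forward predictive coding is selected when Eb > j × Ef holds, and backward predictive coding is selected when Eb < k × Ef holds. In all other cases, that is, when k × Ef ≤ Eb ≤ j × Ef holds, bidirectional predictive coding is selected.
【００２７】
Here 0 < k < j, and in FIG. 16, j = 2 and k = 1/2.
【００２８】
Throughout this specification, the symbols < and > may be replaced by ≤ and ≥. Likewise, ≤ and ≥ may be replaced by < and >.
【００２９】
Accordingly, forward predictive coding is selected when the residual Ef of the forward motion vector is relatively small compared with the residual Eb of the backward motion vector (in FIG. 16, less than (or at most) 1/2 of it). Backward predictive coding is selected when Eb is relatively small compared with Ef (in FIG. 16, less than (or at most) 1/2 of it). Bidirectional prediction is selected when the ratio of Ef to Eb is neither very large nor very small (in FIG. 16, when Ef/Eb is at least (or greater than) 1/2 and at most (or less than) 2).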
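The selection rule of the first method can be sketched in a few lines. This is a minimal illustration, not the applicant's implementation: the function name `select_inter_mode_first` and the string labels are hypothetical, while the defaults j = 2 and k = 1/2 are the values given for FIG. 16.

```python
def select_inter_mode_first(ef: float, eb: float, j: float = 2.0, k: float = 0.5) -> str:
    """Pick an inter coding mode from the two motion vector estimation residuals.

    ef: residual relative to the past reference picture (Ef)
    eb: residual relative to the future reference picture (Eb)
    j, k: constants with 0 < k < j (FIG. 16 uses j = 2, k = 1/2)
    """
    if eb > j * ef:          # backward prediction is much worse: use the past reference only
        return "forward"
    if eb < k * ef:          # forward prediction is much worse: use the future reference only
        return "backward"
    return "bidirectional"   # k*Ef <= Eb <= j*Ef


if __name__ == "__main__":
    for ef, eb in [(100, 250), (100, 40), (100, 100)]:
        print(ef, eb, select_inter_mode_first(ef, eb))
```

Because the comparison is strict, a residual lying exactly on a boundary (for example Eb = 2 × Ef) falls into the bidirectional region; the specification notes that strict and non-strict comparisons are interchangeable.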
【００３０】
ところで、画像のシーケンスが、図１４に示したように、ＩまたはＰピクチャの間に、１枚（フレームまたはフィールド）のＢピクチャが配置されて構成されている場合においては、Ｂピクチャに対する過去参照画像または未来参照画像それぞれとなるＩあるいはＰピクチャ（Ｉ／Ｐピクチャ）から、そのＢピクチャまでの時間的な距離が、いずれも同一であるから、第１の方法によって、符号化効率の向上を図ることができる。
【００３１】
しかしながら、画像のシーケンスが、ＩまたはＰピクチャの間に、２枚以上のＢピクチャが配置されて構成されている場合、即ち、例えば、図１７に示すように、２枚のＢピクチャが配置されて構成されている場合においては、インター符号化の中で、順方向予測符号化または逆方向予測符号化が、最も符号化効率が高いのにも拘らず、双方向予測符号化が選択されることがあった。
【００３２】
なお、このことは、本件発明者が行ったシミュレーションにより確認している。
【００３３】
これは、図１７に示すように、Ｂピクチャから、その過去参照画像または未来参照画像それぞれとなるＩ／Ｐピクチャまでの距離が異なることに起因する。
【００３４】
即ち、２枚のＢピクチャが配置されている場合においては、１枚目のＢピクチャについては、未来参照画像までの距離の方が、過去参照画像までの距離より遠くなり、２枚目のＢピクチャについては、その逆に、過去参照画像までの距離の方が、未来参照画像までの距離より遠くなる。従って、１枚目のＢピクチャについては、逆方向動きベクトルによる予測精度が劣化し、２枚目のＢピクチャについては、順方向動きベクトルによる予測精度が劣化する。
【００３５】
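The applicant has therefore previously proposed a method (hereinafter the second method) that determines the prediction mode while taking into account the distances from the B picture to the past and future reference pictures, so that pictures can be coded efficiently even when two or more B pictures lie between the past and future reference pictures (see, for example, Japanese Patent Application No. H7-210665).
【００３６】
In the second method, the conditions for selecting one of the inter coding modes are changed depending on whether the B picture being coded is closer to the past reference picture or to the future reference picture.
【００３７】
That is, when the B picture being coded is closer to the past reference picture (for example, frames F2, F5, F8, ... in FIG. 17), forward predictive coding is selected when Eb > a × Ef holds and backward predictive coding is selected when Eb < b × Ef holds, as shown in FIG. 18(A). Bidirectional predictive coding is selected when b × Ef ≤ Eb ≤ a × Ef holds.
【００３８】
Here 0 < b < a, and a is smaller than j in FIG. 16. In FIG. 18(A), a = 4/3 and b = 1/2.
【００３９】
On the other hand, when the B picture being coded is closer to the future reference picture (for example, frames F3, F6, F9, ... in FIG. 17), forward predictive coding is selected when Eb > c × Ef holds and backward predictive coding is selected when Eb < d × Ef holds, as shown in FIG. 18(B). Bidirectional predictive coding is selected when d × Ef ≤ Eb ≤ c × Ef holds.
【００４０】
Here 0 < d < c, and d is larger than k in FIG. 16. In FIG. 18(B), c = 2 and d = 3/4.
【００４１】
With these conditions, forward predictive coding, which uses only the past reference picture, is more likely to be selected when the B picture being coded is closer to the past reference picture, and backward predictive coding, which uses only the future reference picture, is more likely to be selected when it is closer to the future reference picture. Predictive coding therefore tends to use only the reference picture that gives the higher prediction accuracy, and coding efficiency improves as a result.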
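The second method differs from the first only in that the two ratio thresholds depend on which reference picture is nearer. The sketch below illustrates this under stated assumptions: the function name and the `df`/`db` distance arguments are hypothetical, the constants are those given for FIG. 18(A), FIG. 18(B), and FIG. 16, and falling back to the FIG. 16 constants when the distances are equal is an assumption made for the example.

```python
def select_inter_mode_second(ef: float, eb: float, df: int, db: int) -> str:
    """Distance-aware variant of the ratio test.

    df, db: distances (in frames) to the past and future reference pictures.
    """
    if df < db:                    # closer to the past reference: FIG. 18(A)
        upper, lower = 4 / 3, 1 / 2
    elif df > db:                  # closer to the future reference: FIG. 18(B)
        upper, lower = 2.0, 3 / 4
    else:                          # equal distances: FIG. 16 constants (assumed fallback)
        upper, lower = 2.0, 1 / 2
    if eb > upper * ef:
        return "forward"
    if eb < lower * ef:
        return "backward"
    return "bidirectional"


if __name__ == "__main__":
    # The same residual pair is classified differently depending on the distances.
    print(select_inter_mode_second(100, 150, df=1, db=2))
    print(select_inter_mode_second(100, 150, df=1, db=1))
```

Lowering the upper threshold from 2 to 4/3 widens the forward region, and raising the lower threshold from 1/2 to 3/4 widens the backward region, which is exactly the narrowing of the bidirectional region that the text later identifies as a drawback for slow or simple motion.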
【００４２】
しかしながら、第２の方法によれば、例えば、動きの遅い画像、あるいは、物体が、水平方向にパンしているなど、一定の単純な動きをしている画像などを符号化対象とすると、符号化効率が若干低下する場合があった。
【００４３】
即ち、動きの遅い画像や、一定の単純な動きをしている画像については、順方向予測符号化または逆方向予測符号化するより、双方向予測符号化する方が予測精度が高くなり、従って、符号化効率も高くなる。しかしながら、第２の方法では、図１８に示したように、図１６における場合に比較して、双方向予測符号化が選択される範囲を狭くして、順方向予測符号化または逆方向予測符号化が選択される範囲を広くしている。これにより、第２の方法によれば、動きの遅い画像や、一定の単純な動きをしている画像を符号化する場合においても、双方向予測符号化より、順方向予測符号化または逆方向予測符号化が選択され易く、その結果、符号化効率が劣化する。
【００４４】
一方、従来においては、動きベクトルの伝送に必要なビット量を考慮せずに、インター符号化の選択（順方向予測符号化、逆方向予測符号化、または双方向予測符号化のうちのいずれか１つの選択）を行っていた。
【００４５】
即ち、従来においては、基本的に、順方向予測符号化、逆方向予測符号化、または双方向予測符号化のうちの、予測残差が最も小さいものが選択されるようになされていた。
【００４６】
しかしながら、例えば、順方向予測符号化、逆方向予測符号化、および双方向予測符号化についてのいずれの予測残差も小さい場合においては、そのうちの双方向予測符号化についてのものが最も小さくても、動きベクトルの伝送に要するビット量をも考慮すると、双方向予測符号化よりも、順方向予測符号化または逆方向予測符号化の方が、符号化効率が良くなることがあった。
【００４７】
なお、このようなケースは、例えば、動きの速い画像を符号化する場合に生じることが多かった。
【００４８】
本発明は、このような状況に鑑みてなされたものであり、画像の符号化効率を、より向上させることができるようにするものである。
【００４９】
【課題を解決するための手段】
本発明の画像符号化装置は、時間的に先行する過去参照画像に対する、符号化対象の画像の動きベクトルである順方向動きベクトルと、時間的に後行する未来参照画像に対する、符号化対象の画像の動きベクトルである逆方向動きベクトルとを検出する動きベクトル検出手段と、過去参照画像または未来参照画像それぞれに対する、符号化対象の画像の予測残差に基づいて、符号化対象の画像の予測モードを順方向予測符号化モード、逆方向予測符号化モード、または双方向予測符号化モードのいずれかに決定する予測モード決定手段と、予測モードに対応する動き補償を行うことにより、予測画像を生成する動き補償手段と、符号化対象の画像と、予測画像との差分値を演算する差分値演算手段と、差分値を符号化する符号化手段とを備え、予測モード決定手段は、過去参照画像または未来参照画像それぞれに対する予測残差をＥｆまたはＥｂとするとともに、α，β，γ，δを所定の定数とする場合において（但し、α，β，γ，δは実数であり、γ＜β）、式Ｅｂ＞α×ＥｆかつＥｂ＞β×Ｅｆ＋（１−α×β）×δが成り立つとき、過去参照画像のみから予測画像を生成する順方向予測符号化モードに、予測モードを決定し、式Ｅｂ≦α×ＥｆかつＥｂ＜γ×Ｅｆ＋（１−α×γ）×δが成り立つとき、未来参照画像のみから予測画像を生成する逆方向予測符号化モードに、予測モードを決定し、式γ×Ｅｆ＋（１−α×γ）×δ≦ＥｂかつＥｂ≦β×Ｅｆ＋（１−α×β）×δが成り立つとき、過去参照画像および未来参照画像の両方から予測画像を生成する双方向予測符号化モードに、予測モードを決定し、順方向動きベクトルまたは逆方向動きベクトルの大きさが大きいほどにδを大きくすることで、双方向予測符号化モードが決定されにくくすることを特徴とする。
【００５０】
本発明の画像符号化方法は、時間的に先行する過去参照画像に対する、符号化対象の画像の動きベクトルである順方向動きベクトルと、時間的に後行する未来参照画像に対する、符号化対象の画像の動きベクトルである逆方向動きベクトルとを検出し、過去参照画像または未来参照画像それぞれに対する、符号化対象の画像の予測残差に基づいて、符号化対象の画像の予測モードを順方向予測符号化モード、逆方向予測符号化モード、または双方向予測符号化モードのいずれかに決定し、予測モードに対応する動き補償を行うことにより、予測画像を生成し、符号化対象の画像と、予測画像との差分値を演算し、差分値を符号化し、過去参照画像または未来参照画像それぞれに対する予測残差をＥｆまたはＥｂとするとともに、α，β，γ，δを所定の定数とする場合において（但し、α，β，γ，δは実数であり、γ＜β）、式Ｅｂ＞α×ＥｆかつＥｂ＞β×Ｅｆ＋（１−α×β）×δが成り立つとき、過去参照画像のみから予測画像を生成する順方向予測符号化モードに、予測モードを決定し、式Ｅｂ≦α×ＥｆかつＥｂ＜γ×Ｅｆ＋（１−α×γ）×δが成り立つとき、未来参照画像のみから予測画像を生成する逆方向予測符号化モードに、予測モードを決定し、式γ×Ｅｆ＋（１−α×γ）×δ≦ＥｂかつＥｂ≦β×Ｅｆ＋（１−α×β）×δが成り立つとき、過去参照画像および未来参照画像の両方から予測画像を生成する双方向予測符号化モードに、予測モードを決定し、順方向動きベクトルまたは逆方向動きベクトルの大きさが大きいほどにδを大きくすることで、双方向予測符号化モードが決定されにくくすることを特徴とする。
【００５１】
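In the image coding apparatus of the present invention, a forward motion vector, which is the motion vector of the picture being coded relative to a temporally preceding past reference picture, and a backward motion vector, which is its motion vector relative to a temporally succeeding future reference picture, are detected. The prediction mode of the picture being coded is determined to be one of a forward predictive coding mode, a backward predictive coding mode, and a bidirectional predictive coding mode on the basis of the prediction residuals of the picture being coded relative to the past and future reference pictures. A predicted picture is generated by performing motion compensation corresponding to the prediction mode, the difference between the picture being coded and the predicted picture is computed, and the difference is coded. Letting Ef and Eb be the prediction residuals relative to the past and future reference pictures, respectively, and letting α, β, γ, and δ be predetermined constants (real numbers with γ < β), the prediction mode is set to the forward predictive coding mode, in which the predicted picture is generated from the past reference picture only, when Eb > α × Ef and Eb > β × Ef + (1 − α × β) × δ hold; to the backward predictive coding mode, in which the predicted picture is generated from the future reference picture only, when Eb ≤ α × Ef and Eb < γ × Ef + (1 − α × γ) × δ hold; and to the bidirectional predictive coding mode, in which the predicted picture is generated from both the past and future reference pictures, when γ × Ef + (1 − α × γ) × δ ≤ Eb and Eb ≤ β × Ef + (1 − α × β) × δ hold. δ is made larger as the magnitude of the forward or backward motion vector increases, so that the bidirectional predictive coding mode becomes less likely to be chosen.
【００５２】
In the image coding method of the present invention, a forward motion vector, which is the motion vector of the picture being coded relative to a temporally preceding past reference picture, and a backward motion vector, which is its motion vector relative to a temporally succeeding future reference picture, are detected. The prediction mode of the picture being coded is determined to be one of a forward predictive coding mode, a backward predictive coding mode, and a bidirectional predictive coding mode on the basis of the prediction residuals of the picture being coded relative to the past and future reference pictures. A predicted picture is generated by performing motion compensation corresponding to the prediction mode, the difference between the picture being coded and the predicted picture is computed, and the difference is coded. Letting Ef and Eb be the prediction residuals relative to the past and future reference pictures, respectively, and letting α, β, γ, and δ be predetermined constants (real numbers with γ < β), the prediction mode is set to the forward predictive coding mode, in which the predicted picture is generated from the past reference picture only, when Eb > α × Ef and Eb > β × Ef + (1 − α × β) × δ hold; to the backward predictive coding mode, in which the predicted picture is generated from the future reference picture only, when Eb ≤ α × Ef and Eb < γ × Ef + (1 − α × γ) × δ hold; and to the bidirectional predictive coding mode, in which the predicted picture is generated from both the past and future reference pictures, when γ × Ef + (1 − α × γ) × δ ≤ Eb and Eb ≤ β × Ef + (1 − α × β) × δ hold. δ is made larger as the magnitude of the forward or backward motion vector increases, so that the bidirectional predictive coding mode becomes less likely to be chosen.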
【００５３】
【発明の実施の形態】
以下に、本発明の実施の形態を説明するが、その前に、特許請求の範囲に記載の発明の各手段と以下の実施の形態との対応関係を明らかにするために、各手段の後の括弧内に、対応する実施の形態（但し、一例）を付加して、本発明の特徴を記述すると、次のようになる。
【００５４】
即ち、請求項１に記載の画像符号化装置は、時間的に先行する過去参照画像に対する、符号化対象の画像の動きベクトルである順方向動きベクトルと、時間的に後行する未来参照画像に対する、符号化対象の画像の動きベクトルである逆方向動きベクトルとを検出する動きベクトル検出手段（例えば、図６に示す動きベクトル推定回路６など）と、順方向動きベクトルまたは逆方向動きベクトルに対応して、符号化対象の画像の予測モードを決定する予測モード決定手段（例えば、図７に示す予測モード決定回路２１など）と、予測モードに対応する動き補償を行うことにより、予測画像を生成する動き補償手段（例えば、図７に示す動き補償回路２０など）と、符号化対象の画像と、予測画像との差分値を演算する差分値演算手段（例えば、図７に示す演算部１１など）と、差分値を符号化する符号化手段（例えば、図７に示すＤＣＴ回路１２や、量子化回路１３、可変長符号化回路１５など）とを備えることを特徴とする。
【００５５】
なお、勿論この記載は、各手段を上記したものに限定することを意味するものではない。
【００５６】
次に、本発明の原理について説明する。
【００５７】
動画像においては、一般に、画像どうしの時間軸方向の相関は、その画像どうしの距離（間隔）が大きくなるほど小さくなる。
【００５８】
従って、例えば、図１４と同一の図１に示すような、Ｉ／Ｐピクチャの間に１枚のＢピクチャが配置されたシーケンスにおいては、Ｂピクチャと、過去参照画像または未来参照画像それぞれとの相関は等しく、その結果、過去参照画像および未来参照画像に対する動きベクトル推定残差Ｅｆ，Ｅｂについての統計的な性質も等しくなる。
【００５９】
一方、例えば、図１７と同一の図２に示すような、Ｉ／Ｐピクチャの間に２枚以上のＢピクチャが配置されたシーケンスにおいては、Ｂピクチャと、過去参照画像または未来参照画像それぞれとの相関は、その距離に対応して変化する。
【００６０】
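For this reason, suppose that, as shown in FIG. 3 for example, three B pictures $B_{n+1}$, $B_{n+2}$, $B_{n+3}$ lie between the P pictures $P_n$ and $P_{n+4}$ and are predictively coded with $P_n$ as the past reference picture and $P_{n+4}$ as the future reference picture. The motion vector estimation residuals $E_{f1}$, $E_{f2}$, $E_{f3}$ of $B_{n+1}$, $B_{n+2}$, $B_{n+3}$ relative to the past reference picture $P_n$ then generally satisfy $E_{f1} < E_{f2} < E_{f3}$.
【００６１】
Similarly, the motion vector estimation residuals $E_{b1}$, $E_{b2}$, $E_{b3}$ of $B_{n+1}$, $B_{n+2}$, $B_{n+3}$ relative to the future reference picture $P_{n+4}$ generally satisfy $E_{b1} > E_{b2} > E_{b3}$.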
【００６２】
以上のように、Ｉ／Ｐピクチャの間に、２枚以上のＢピクチャが配置されている場合には、各Ｂピクチャについて、過去参照画像または未来参照画像それぞれまでの距離が異なるため、その相関も異なる。その結果、過去参照画像または未来参照画像に対する動きベクトル残差それぞれの統計的性質も、各Ｂピクチャによって異なり、従って、符号化効率を向上させるには、各Ｂピクチャを符号化する際の予測モードの決定方法を、その統計的性質に応じて変える必要がある。
【００６３】
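Next, the prediction accuracy of bidirectional predictive coding generally falls as the motion in the picture becomes faster. Bidirectional predictive coding must also transmit both the forward and the backward motion vector. Taking this into account, when the motion is fast, coding with only the reference picture temporally nearest to the B picture being coded often produces less data overall, even when the prediction residual of bidirectional predictive coding is the smallest.
【００６４】
The speed of the motion can be expressed, for example, by the magnitude of the motion vector. Writing the motion vector as MV, its x (horizontal) component as $v_x$, and its y (vertical) component as $v_y$, the magnitude is $|MV| = (v_x^2 + v_y^2)^{1/2}$.
【００６５】
Accordingly, when two B pictures lie between I/P pictures as shown in FIG. 2, coding efficiency can be improved by setting the prediction mode as follows according to the motion vector magnitude |MV|.
【００６６】
Let Df and Db be the number of frames from the B picture being coded to the past and future reference pictures, respectively. When Df = 1 and Db = 2 (the past reference picture is the nearer one), forward predictive coding is selected, as shown in FIG. 4(A) for example, when Eb > p × Ef and Eb > q × Ef + (1 − p × q) × T_i hold, and backward predictive coding is selected when Eb ≤ p × Ef and Eb < r × Ef + (1 − p × r) × T_i hold. Bidirectional predictive coding is selected when r × Ef + (1 − p × r) × T_i ≤ Eb and Eb ≤ q × Ef + (1 − p × q) × T_i hold.
【００６７】
Here T_i is a constant greater than or equal to 0, 0 < r < q, and q is smaller than j in FIG. 16. In FIG. 4(A), q = 5/4, r = 3/4, and p = 1.
【００６８】
In this case, bidirectional predictive coding is not selected when the prediction residual Ef is less than (or at most) T_i or the prediction residual Eb is less than p × T_i. In other words, bidirectional predictive coding can be selected only when Ef is at least (or greater than) T_i or Eb is at least p × T_i.
【００６９】
Accordingly, setting the constant T_i to a larger value as the motion vector magnitude |MV| increases makes bidirectional predictive coding less likely to be selected.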
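The offset conditions of FIG. 4(A) can be written as a small function, which also makes the role of T_i visible. This is a sketch under assumptions: the function name `select_inter_mode_offset` is hypothetical, and the defaults p = 1, q = 5/4, r = 3/4 are the FIG. 4(A) values. With p = 1 both boundary lines pass through the point (Ef, Eb) = (T_i, T_i), so the bidirectional wedge only opens up once the residuals exceed T_i.

```python
def select_inter_mode_offset(ef: float, eb: float, t_i: float,
                             p: float = 1.0, q: float = 5 / 4, r: float = 3 / 4) -> str:
    """Inter mode selection with the T_i offset (conditions of FIG. 4(A))."""
    upper = q * ef + (1 - p * q) * t_i   # boundary between forward and bidirectional
    lower = r * ef + (1 - p * r) * t_i   # boundary between bidirectional and backward
    if eb > p * ef and eb > upper:
        return "forward"
    if eb <= p * ef and eb < lower:
        return "backward"
    return "bidirectional"


if __name__ == "__main__":
    # Small residuals: with T_i = 100 the bidirectional mode is never chosen,
    # whereas with T_i = 0 the same residuals land in the bidirectional wedge.
    print(select_inter_mode_offset(50, 55, t_i=100))
    print(select_inter_mode_offset(50, 55, t_i=0))
```

The same function covers FIG. 4(B) by passing s, t, u in place of p, q, r.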
【００７０】
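That is, with $T_1 < T_2 < \dots < T_n < T_{n+1}$ and $0 < mv_0 < mv_1 < \dots < mv_{n-1} < mv_n$, $T_i$ is set to $T_1$ when $mv_0 \le |MV| < mv_1$, to $T_2$ when $mv_1 \le |MV| < mv_2$, ..., to $T_n$ when $mv_{n-1} \le |MV| < mv_n$, and to $T_{n+1}$ when $|MV| \ge mv_n$. In this way, the faster the motion, the less likely bidirectional predictive coding is to be selected. Since fast motion lowers the prediction accuracy of bidirectional predictive coding and greatly increases the number of bits it spends on motion vectors, coding efficiency improves as a result.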
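The mapping from motion vector magnitude to T_i is a step function, which a sorted lookup expresses compactly. The sketch below is illustrative only: the function names are hypothetical and the numeric thresholds in the usage lines are made up, since the specification gives no concrete values for mv_0 ... mv_n or T_1 ... T_{n+1}. Returning 0 below mv_0 follows the later statement that T_i = 0 is used for slow motion.

```python
import math
from bisect import bisect_right


def motion_magnitude(vx: float, vy: float) -> float:
    """|MV| = sqrt(vx^2 + vy^2)."""
    return math.hypot(vx, vy)


def threshold_for_motion(magnitude: float, mv_bounds: list, t_values: list) -> float:
    """Return T_i for a given |MV|.

    mv_bounds: [mv_0, mv_1, ..., mv_n], strictly increasing
    t_values:  [T_1, T_2, ..., T_{n+1}], strictly increasing (same length)
    """
    if len(mv_bounds) != len(t_values):
        raise ValueError("need one T value per lower bound")
    if magnitude < mv_bounds[0]:
        return 0.0  # slow motion: handled with T_i = 0 (conditions of FIG. 5)
    # bisect_right counts the bounds <= magnitude; that count minus one indexes T.
    return t_values[bisect_right(mv_bounds, magnitude) - 1]


if __name__ == "__main__":
    bounds = [2.0, 8.0, 16.0]      # hypothetical mv_0, mv_1, mv_2
    t_vals = [100.0, 200.0, 400.0]  # hypothetical T_1, T_2, T_3
    print(threshold_for_motion(motion_magnitude(3, 4), bounds, t_vals))
```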
【００７１】
また、この場合、符号化対象のＢピクチャが、過去参照画像に近いことから、その過去参照画像のみを用いる順方向予測符号化が選択され易くなっているので、この点からも、符号化効率を向上させることができる。
【００７２】
一方、Ｄｆ＝２およびＤｂ＝１の場合（符号化対象のＢピクチャからの距離が、未来参照画像の方が近い場合）、例えば、図４（Ｂ）に示すように、式Ｅｂ＞ｓ×ＥｆかつＥｂ＞ｔ×Ｅｆ＋（１−ｓ×ｔ）×Ｔ_iが成り立つとき、順方向予測符号化を選択し、式Ｅｂ≦ｓ×ＥｆかつＥｂ＜ｕ×Ｅｆ＋（１−ｓ×ｕ）×Ｔ_iが成り立つとき、逆方向予測符号化を選択する。また、式ｕ×Ｅｆ＋（１−ｓ×ｕ）×Ｔ_i≦ＥｂかつＥｂ≦ｔ×Ｅｆ＋（１−ｓ×ｔ）×Ｔ_iが成り立つとき、双方向予測符号化を選択する。
【００７３】
ここで、０＜ｕ＜ｔであり、また、ｕは、図１６におけるｋより大きい値である。図４（Ｂ）においては、ｔ＝４／３，ｕ＝４／５となっている。また、ｓ＝１となっている。
【００７４】
この場合も、予測残差ＥｆがＴ_i未満か、または予測誤差Ｅｂがｓ×Ｔ_i未満となるときは、双方向予測符号化は選択されない。即ち、この場合、双方向予測符号化は、予測残差ＥｆがＴ_i以上となるか、または予測誤差Ｅｂがｓ×Ｔ_i以上となるときに限り選択され得る。
【００７５】
従って、上述の場合と同様に、動きベクトルの大きさ｜ＭＶ｜が大きくなるにつれて、定数Ｔ_iを大きな値に設定することにより、双方向予測符号化が選択され難くなり、その結果、符号化効率を向上させることができる。
【００７６】
また、この場合、符号化対象のＢピクチャが、未来参照画像に近いことから、その未来参照画像のみを用いる逆方向予測符号化が選択され易くなっているので、この点からも、符号化効率を向上させることができる。
【００７７】
なお、画像の動きが遅い場合には、前述したように、双方向予測符号化の予測精度が高く、また、発生符号量も少なくなるので、双方向予測符号化が選択されるのが望ましい。そこで、動きベクトルの大きさ｜ＭＶ｜が所定の値ｍｖ₀未満となった場合には、例えば、図１６と同一の図５に示すように、式Ｅｂ＞ｊ×Ｅｆが成り立つときは、順方向予測符号化を選択し、式Ｅｂ＜ｋ×Ｅｆが成り立つときは、逆方向予測符号化を選択し、式ｋ×Ｅｆ≦Ｅｂ≦ｊ×Ｅｆが成り立つときは、双方向予測符号化を選択するようにする。
【００７８】
即ち、図４において、例えば、ｔ＝ｑ＝ｊ，ｒ＝ｕ＝ｋ，Ｔ_i＝０とする。
【００７９】
このようにすることで、動きベクトルの大きさ｜ＭＶ｜がｍｖ₀未満となった場合には、予測精度の高い双方向予測符号化が選択され易くなり、その結果、符号化効率を向上させることができる。
【００８０】
なお、画像の動きの速さは、動きベクトルの大きさ｜ＭＶ｜の他、例えば、動きベクトルＭＶのｘ成分の絶対値とｙ成分の絶対値との和｜ｘ｜＋｜ｙ｜などにも反映される。そこで、上述の定数Ｔ_iは、この成分の絶対値和｜ｘ｜＋｜ｙ｜に対応して設定することも可能である。
【００８１】
次に、双方向予測符号化による予測精度は、画像の動きの速さの他、その複雑さによっても変化する。即ち、双方向予測符号化による予測精度は、基本的に、画像の動きが、物体が、水平方向にパンしているなど、一定の単純なものであるときは高くなり、複雑になるほど低下する。
【００８２】
このため、双方向予測符号化による場合には、順方向動きベクトルと逆方向動きベクトルとの両方を伝送しなければならないことをも考慮すると、画像の動きが複雑な場合には、双方向予測符号化による予測残差が最も小さいときであっても、符号化対象のＢピクチャから時間的に最も近い参照画像（過去参照画像または未来参照画像までの距離が等しい場合には、そのうちのいずれか一方）のみを用いて予測符号化を行う方が、発生する全体のデータ量が少なくなることが多い。
【００８３】
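On the other hand, in a picture in which an object undergoes translation, for example, the forward and backward motion vectors point in opposite directions. That is, the x components of the forward and backward motion vectors have opposite signs, and so do their y components.
【００８４】
Conversely, when an object moves in a complex manner, the signs of the x components, the signs of the y components, or both are the same.
【００８５】
Accordingly, letting Fx and Fy be the x and y components of the forward motion vector and Bx and By be the x and y components of the backward motion vector, the quantity SMV given by the following expression reflects the complexity of the motion in the picture.
【００８６】
SMV = |Fx + Bx| + |Fy + By|
【００８７】
Besides varying with the complexity of the motion, SMV tends to be small when the prediction accuracy of both forward and backward predictive coding is high, and large when the prediction accuracy of either one is low.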
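A direct transcription of the SMV expression shows how opposite vectors cancel. The function name `smv` and the tuple representation of motion vectors are assumptions made for this sketch.

```python
def smv(forward_mv: tuple, backward_mv: tuple) -> float:
    """SMV = |Fx + Bx| + |Fy + By| for forward (Fx, Fy) and backward (Bx, By) vectors."""
    fx, fy = forward_mv
    bx, by = backward_mv
    return abs(fx + bx) + abs(fy + by)


if __name__ == "__main__":
    # Exactly opposite vectors (pure translation, equal reference distances) cancel.
    print(smv((3, -2), (-3, 2)))
    # Same-sign x components do not cancel, so SMV grows.
    print(smv((3, 2), (3, -2)))
```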
【００８８】
そこで、Ｉ／Ｐピクチャの間に、例えば、図２に示したように、２枚のＢピクチャが配置されている場合においては、ＳＭＶに対応して、次のように予測モードを設定することによっても、符号化効率を向上させることができる。
【００８９】
即ち、まず、Ｄｆ＝１およびＤｂ＝２の場合、例えば、図４（Ａ）に示したように、式Ｅｂ＞ｐ×ＥｆかつＥｂ＞ｑ×Ｅｆ＋（１−ｐ×ｑ）×Ｔ_iが成り立つとき、順方向予測符号化を選択し、式Ｅｂ≦ｐ×ＥｆかつＥｂ＜ｒ×Ｅｆ＋（１−ｐ×ｒ）×Ｔ_iが成り立つとき、逆方向予測符号化を選択する。また、式ｒ×Ｅｆ＋（１−ｐ×ｒ）×Ｔ_i≦ＥｂかつＥｂ≦ｑ×Ｅｆ＋（１−ｐ×ｑ）×Ｔ_iが成り立つとき、双方向予測符号化を選択する。
【００９０】
この場合、上述したように、予測残差ＥｆがＴ_i未満か、または予測誤差Ｅｂがｐ×Ｔ_i未満となるときは、双方向予測符号化は選択されない。即ち、この場合、双方向予測符号化は、予測残差ＥｆがＴ_i以上となるか、または予測誤差Ｅｂがｐ×Ｔ_i以上となるときに限り選択され得る。
【００９１】
従って、この場合、ＳＭＶが大きくなるにつれて、定数Ｔ_iを大きな値に設定することにより、双方向予測符号化が選択され難くなる。
【００９２】
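That is, with $0 < MV_0 < MV_1 < \dots < MV_{n-1} < MV_n$, $T_i$ is set to $T_1$ when $MV_0 \le SMV < MV_1$, to $T_2$ when $MV_1 \le SMV < MV_2$, ..., to $T_n$ when $MV_{n-1} \le SMV < MV_n$, and to $T_{n+1}$ when $SMV \ge MV_n$. In this way, the more complex the motion, the less likely bidirectional predictive coding, whose prediction accuracy degrades with complexity, is to be selected, and coding efficiency improves as a result.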
【００９３】
また、この場合、符号化対象のＢピクチャが、過去参照画像に近いことから、その過去参照画像のみを用いる順方向予測符号化が選択され易くなっているので、この点からも、符号化効率を向上させることができる。
【００９４】
一方、Ｄｆ＝２およびＤｂ＝１の場合、例えば、図４（Ｂ）に示したように、式Ｅｂ＞ｓ×ＥｆかつＥｂ＞ｔ×Ｅｆ＋（１−ｓ×ｔ）×Ｔ_iが成り立つとき、順方向予測符号化を選択し、式Ｅｂ≦ｓ×ＥｆかつＥｂ＜ｕ×Ｅｆ＋（１−ｓ×ｕ）×Ｔ_iが成り立つとき、逆方向予測符号化を選択する。また、式ｕ×Ｅｆ＋（１−ｓ×ｕ）×Ｔ_i≦ＥｂかつＥｂ≦ｔ×Ｅｆ＋（１−ｓ×ｔ）×Ｔ_iが成り立つとき、双方向予測符号化を選択する。
【００９５】
この場合も、予測残差ＥｆがＴ_i未満か、または予測誤差Ｅｂがｓ×Ｔ_i未満となるときは、双方向予測符号化は選択されない。即ち、この場合、双方向予測符号化は、予測残差ＥｆがＴ_i以上となるか、または予測誤差Ｅｂがｓ×Ｔ_i以上となるときに限り選択され得る。
【００９６】
従って、やはり、上述の場合と同様に、ＳＭＶが大きくなるにつれて、定数Ｔ_iを大きな値に設定することにより、双方向予測符号化が選択され難くなり、その結果、符号化効率を向上させることができる。
【００９７】
また、この場合、符号化対象のＢピクチャが、未来参照画像に近いことから、その未来参照画像のみを用いる逆方向予測符号化が選択され易くなっているので、この点からも、符号化効率を向上させることができる。
【００９８】
なお、画像の動きが非常に単純な場合、即ち、例えば、物体が、一定方向に平行移動しているような場合には、ＳＭＶは非常に小さな値となる（理想的には、０となる）。また、この場合、前述したように、双方向予測符号化の予測精度が高く、また、発生符号量も少なくなるので、双方向予測符号化が選択されるのが望ましい。そこで、ＳＭＶが所定の値ＭＶ₀未満となった場合には、例えば、図１６と同一の図５に示すように、式Ｅｂ＞ｊ×Ｅｆが成り立つときは、順方向予測符号化を選択し、式Ｅｂ＜ｋ×Ｅｆが成り立つときは、逆方向予測符号化を選択し、式ｋ×Ｅｆ≦Ｅｂ≦ｊ×Ｅｆが成り立つときは、双方向予測符号化を選択するようにする。
【００９９】
即ち、図４において、例えば、ｔ＝ｑ＝ｊ，ｒ＝ｕ＝ｋ，Ｔ_i＝０とする。
【０１００】
このようにすることで、ＳＭＶがＭＶ₀未満となった場合には、予測精度の高い双方向予測符号化が選択され易くなり、その結果、符号化効率を向上させることができる。
【０１０１】
また、画像の動きが非常に単純な場合の例として、ビデオカメラをパンして撮影した画像があるが、この場合、動きベクトルのｘ成分が、そのｙ成分に比較して非常に大きくなる。そこで、例えば、ｇを所定の定数（１より大きい値である、例えば４など）として、式｜ｘ｜＞ｇ｜ｙ｜が成り立つときにも、上述のように、双方向予測符号化が選択され易くするようにすることが可能である。なお、このことは、式ｇ｜ｘ｜＜｜ｙ｜が成り立つときについても同様である。
【０１０２】
以上のように、画像の動きの速さや複雑さに対応して、適応的に、予測モードを選択（決定）するようにすることで、符号化効率を、従来より向上させることができる。
【０１０３】
なお、上述の場合においては、Ｉ／Ｐピクチャの間に、２枚のＢピクチャが配置されているとしたが、その間に、１枚だけまたは３枚以上のＢピクチャが配置されている場合についても同様のことがいえる。
【０１０４】
次に、図６および図７は、本発明を適用した画像符号化装置の一実施の形態の構成を示している。
【０１０５】
この画像符号化装置は、上述した、例えば、画像の動きの複雑さを反映するＳＭＶに対応して予測モードを決定し、画像を、動き補償とＤＣＴ（Discrete Cosine Transform）とを組み合わせたハイブリッド符号化するようになされている。
【０１０６】
即ち、符号化すべき画像データは、例えば、フレーム（またはフィールド）単位で、画像符号化タイプ指定回路３に供給される。画像符号化タイプ指定回路３は、そこに入力されるフレームを、Ｉ，Ｐ、またはＢピクチャ（以下、適宜、これらをまとめてピクチャタイプという）のいずれとして処理するのかを指定する。
【０１０７】
具体的には、画像符号化タイプ指定回路３は、例えば、図８（Ａ）に示すように、そこに入力される１６フレームの画像Ｆ１乃至Ｆ１６を１ＧＯＰのデータとして処理し、同図（Ｂ）に示すように、最初のフレームＦ１をＩピクチャとして、２番目および３番目のフレームＦ２およびＦ３をＢピクチャとして、４番目のフレームＦ４をＰピクチャとして指定する。さらに、画像符号化タイプ指定回路３は、５番目および６番目のフレームＦ５およびＦ６をＢピクチャとして、７番目のフレームＦ７をＰピクチャとして指定し、以下、同様にして、残りのフレームＦ８乃至Ｆ１６を、ＢまたはＰピクチャとして指定する。
【０１０８】
In FIG. 8(B) (and likewise in FIG. 8(C)), the subscripts attached to I, P, and B correspond to the temporal reference in MPEG and indicate the display order of each frame.
【０１０９】
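Frames whose picture type has been designated by the picture coding type designation circuit 3 are output to the picture coding order rearrangement circuit 4, which rearranges them into coding order. At the receiving side, a B picture may be decoded using as a reference picture (the future reference picture) a picture that is displayed after it, so the B picture cannot be decoded unless that future reference picture has already been decoded. The picture coding order rearrangement circuit 4 therefore reorders the frames of the GOP so that a frame serving as a future reference picture is coded before the B pictures that refer to it.
【０１１０】
Specifically, the frames are rearranged, for example, as shown in FIG. 8(C).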
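The reordering from display order to coding order can be sketched as follows. This is an illustration under assumptions, not a description of circuit 4: the function name and the (type, temporal reference) tuples are hypothetical, and the rule used (hold each B picture back until the next I or P picture has been emitted) is inferred from the processing order I1, P4, B2, B3, P7, B5 given in the description of the motion vector estimation circuit.

```python
def to_coding_order(display_order: list) -> list:
    """Reorder pictures so every future reference precedes the B pictures that use it.

    display_order: list of (picture_type, temporal_reference) tuples in display order.
    """
    coded, pending_b = [], []
    for picture in display_order:
        if picture[0] == "B":
            pending_b.append(picture)   # wait for the future reference picture
        else:
            coded.append(picture)       # I or P picture: emit it first ...
            coded.extend(pending_b)     # ... then the B pictures that were waiting
            pending_b = []
    coded.extend(pending_b)             # trailing B pictures, if any, are emitted last
    return coded


if __name__ == "__main__":
    gop = [("I", 1), ("B", 2), ("B", 3), ("P", 4), ("B", 5), ("B", 6), ("P", 7)]
    print(to_coding_order(gop))
```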
【０１１１】
画像符号化順序替え回路４で並びの替えられたフレームのシーケンスは、スキャンコンバータ５に供給される。スキャンコンバータ５では、ラスタスキャンで入力されるフレームがブロックフォーマットの信号に変換される。
【０１１２】
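That is, the scan converter 5 receives picture data in a frame format consisting of V lines of H dots each. It divides this picture data into N slices of 16 lines each, as shown in FIG. 9(A) (so that V = 16 × N here), and then divides each slice every 16 dots into M macroblocks, as shown in FIG. 9(B) (so that H = 16 × M here).
【０１１３】
Each macroblock therefore consists of the luminance signal for 16 × 16 dots. As shown in FIG. 9(C), the macroblock is divided into the luminance blocks Y[1] to Y[4], each corresponding to 8 × 8 dots, and is further associated with the chrominance blocks Cb[5] and Cr[6], each corresponding to 8 × 8 dots. The DCT circuit 12 (FIG. 7) described later applies the DCT in units of these 8 × 8-dot blocks.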
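The raster-to-block conversion for the luminance plane can be sketched with plain nested lists. The function name `split_into_macroblocks` and the list-of-rows frame representation are assumptions made for this example; chrominance handling is omitted.

```python
def split_into_macroblocks(frame: list, size: int = 16) -> list:
    """Split a V x H luminance frame into N slices of M macroblocks each.

    frame: list of V rows, each a list of H pixel values, with V = size*N and H = size*M.
    Returns slices[n][m], each macroblock being a list of `size` rows of `size` pixels.
    """
    v, h = len(frame), len(frame[0])
    if v % size or h % size:
        raise ValueError("frame dimensions must be multiples of the macroblock size")
    slices = []
    for top in range(0, v, size):            # one slice per group of 16 lines
        lines = frame[top:top + size]
        slices.append([[line[left:left + size] for line in lines]
                       for left in range(0, h, size)])  # cut the slice every 16 dots
    return slices


if __name__ == "__main__":
    width, height = 48, 32
    frame = [[y * width + x for x in range(width)] for y in range(height)]
    blocks = split_into_macroblocks(frame)
    print(len(blocks), len(blocks[0]))  # N slices, M macroblocks per slice
```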
【０１１４】
以上のようにして、スキャンコンバータ５で得られたマクロブロックは、図７の演算部１１に供給される。
【０１１５】
図６に戻り、カウンタ９は、画像符号化順序替え回路４が出力するフレーム同期信号をカウントしている。
【０１１６】
即ち、画像符号化順序替え回路４は、スキャンコンバータ５に、並び替えたフレームを出力するタイミングで、フレーム同期信号を、カウンタ９に出力している。さらに、画像符号化順序替え回路４は、スキャンコンバータ５に出力するフレームのピクチャタイプＴＹＰＥを検出し、動きベクトル推定回路６、カウンタ９、および図７の予測モード決定回路２１に出力している。
【０１１７】
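The counter 9 counts the frame synchronization signals output by the picture coding order rearrangement circuit 4 and outputs the count value CNT to the inter-picture distance generation circuit 10. The counter 9 resets CNT to, for example, 0 when the picture type TYPE output by the picture coding order rearrangement circuit 4 is an I or P picture.
【０１１８】
The count value CNT output by the counter 9 therefore represents the number of B pictures placed between I or P pictures.
【０１１９】
In this embodiment, two B pictures lie between I or P pictures as shown in FIG. 8(B), so the count value CNT output by the counter 9 is 0, 1, or 2, as shown in FIG. 8(D).
【０１２０】
From the count value CNT supplied by the counter 9, the inter-picture distance generation circuit 10 calculates the distances (in frames) Df and Db from a B picture to the past and future reference pictures used for its predictive (inter) coding, and outputs them to the prediction mode determination circuit 21 of FIG. 7.
【０１２１】
That is, the inter-picture distance generation circuit 10 outputs, as the distance Df to the past reference picture, the same value as the count value CNT, as shown in FIG. 8(E), and outputs, as the distance Db to the future reference picture, the count values CNT arranged in reverse order, as shown in FIG. 8(F).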
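The counter and distance generation can be mimicked as below. This is a sketch under explicit assumptions: FIG. 8 is not reproduced in the text, so reading "CNT arranged in reverse order" as Db = (number of B pictures between references + 1) − CNT is an interpretation, and the function name, the `b_between` parameter, and the use of `None` for I and P pictures are all hypothetical.

```python
def reference_distances(coding_order_types: list, b_between: int = 2) -> list:
    """Derive (Df, Db) for each picture from a reset-on-I/P counter.

    coding_order_types: picture types ("I", "P", "B") in coding order.
    b_between: number of B pictures between consecutive I/P pictures (assumed constant).
    """
    cnt = 0
    result = []
    for picture_type in coding_order_types:
        if picture_type in ("I", "P"):
            cnt = 0                          # counter 9 resets on I and P pictures
            result.append((picture_type, None, None))
        else:
            cnt += 1                         # one more frame sync since the last I/P
            df = cnt                         # distance to the past reference
            db = b_between + 1 - cnt         # assumed reading of "CNT in reverse order"
            result.append((picture_type, df, db))
    return result


if __name__ == "__main__":
    for entry in reference_distances(["I", "P", "B", "B", "P", "B", "B"]):
        print(entry)
```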
【０１２２】
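Meanwhile, the motion vector estimation circuit 6 detects (estimates) the forward motion vector MVf and the backward motion vector MVb, and calculates the prediction residuals (motion vector estimation residuals) Ef and Eb for MVf and MVb, respectively.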
【０１２３】
即ち、動きベクトル推定回路６には、画像符号化順序替え回路４から、ピクチャタイプＴＹＰＥが指定されたフレームと、そのピクチャタイプＴＹＰＥが供給されるようになされている。
【０１２４】
動きベクトル推定回路６は、画像符号化順序替え回路４から供給されるフレームを、そのピクチャタイプＴＹＰＥにしたがって、記憶部７を構成する過去参照画像記憶部７Ａ、現在画像記憶部７Ｂ、または未来参照画像記憶部７Ｃのうちのいずれかに記憶させ、現在画像記憶部７Ｂに記憶された画像を対象に、その動きベクトルを検出する。
【０１２５】
具体的には、動きベクトル推定回路６は、例えば、図８に示した場合において、Ｉ₁を過去参照画像記憶部７Ａに記憶させ、Ｐ₄を現在画像記憶部７Ｂに記憶させ、これにより、Ｉ₁を過去参照画像として、Ｐ₄の動きベクトル（順方向動きベクトル）ＭＶｆを検出し、その予測残差Ｅｆを求める。次に、現在画像記憶部７Ｂに記憶されていたＰ₄を未来参照画像記憶部７Ｃに転送し、Ｂ₂を現在画像記憶部７Ｂに記憶させ、これにより、Ｉ₁またはＰ₄を、それぞれ過去参照画像または未来参照画像として、Ｂ₂の順方向動きベクトルＭＶｆまたは逆方向動きベクトルＭＶｂを検出し、それぞれの予測残差ＥｆまたはＥｂを求める。
【０１２６】
続いて、Ｂ₃を現在画像記憶部７Ｂに記憶させ、これにより、上述した場合と同様に、Ｂ₃の順方向動きベクトルＭＶｆまたは逆方向動きベクトルＭＶｂを検出し、それぞれの予測残差ＥｆまたはＥｂを求める。
【０１２７】
その後、未来参照画像記憶部７Ｃに記憶されていたＰ₄を、過去参照画像記憶部７Ａに転送して記憶させる（上書きする）とともに、Ｐ₇を現在画像記憶部７Ｂに記憶させ、これにより、Ｐ₄を過去参照画像として、Ｐ₇の動きベクトルＭＶｆを検出し、その予測残差Ｅｆを求める。
【０１２８】
次に、現在画像記憶部７Ｂに記憶されていたＰ₇を未来参照画像記憶部７Ｃに転送し、Ｂ₅を現在画像記憶部７Ｂに記憶させ、これにより、Ｐ₄またはＰ₇を、それぞれ過去参照画像または未来参照画像として、Ｂ₅の順方向動きベクトルＭＶｆまたは逆方向動きベクトルＭＶｂを検出し、それぞれの予測残差ＥｆまたはＥｂを求める。以下、同様にして、動きベクトルの検出と、予測残差の算出が行われていく。
【０１２９】
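The prediction residuals Ef and Eb are calculated as follows.
【０１３０】
Take a given macroblock as the macroblock of interest. Let $A_{ij}$ denote the value of the pixel that is i-th from the left and j-th from the top in the macroblock of interest, and let $F_{ij}$ denote the value of the pixel that is i-th from the left and j-th from the top in the 16 × 16 region of the past reference picture that best matches the macroblock of interest. The prediction residual Ef is then calculated, for example, by the following expression.
【０１３１】
$E_f = \sum |A_{ij} - F_{ij}|$
In this expression, Σ denotes summation over i and j from 1 to 16.
【０１３２】
Similarly, let $B_{ij}$ denote the value of the pixel that is i-th from the left and j-th from the top in the 16 × 16 region of the future reference picture that best matches the macroblock of interest. The prediction residual Eb is then calculated, for example, by the following expression.
【０１３３】
$E_b = \sum |A_{ij} - B_{ij}|$
Here too, Σ denotes summation over i and j from 1 to 16.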
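Both residuals are sums of absolute differences over a 16 × 16 block, so one helper serves for Ef and Eb. The function name `sad` and the list-of-rows block representation are assumptions for this sketch; the search that finds the best-matching region is not shown.

```python
def sad(block: list, prediction: list) -> int:
    """Sum of absolute differences between a macroblock and a same-sized predicted block."""
    return sum(abs(a - f)
               for row_a, row_f in zip(block, prediction)
               for a, f in zip(row_a, row_f))


if __name__ == "__main__":
    current = [[10] * 16 for _ in range(16)]
    past_match = [[7] * 16 for _ in range(16)]     # best match in the past reference
    future_match = [[12] * 16 for _ in range(16)]  # best match in the future reference
    ef = sad(current, past_match)
    eb = sad(current, future_match)
    print(ef, eb)
```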
【０１３４】
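The motion vectors MVf and MVb and the prediction residuals Ef and Eb obtained in this way are supplied to the prediction mode determination circuit 21 of FIG. 7. The motion vectors MVf and MVb are also supplied to the variable-length coding circuit 15 and the motion compensation circuit 20 of FIG. 7. In addition, the motion vectors MVf and MVb of B pictures are supplied to the motion amount calculation circuit 8.
【０１３５】
The motion amount calculation circuit 8 calculates the SMV described above from the motion vectors MVf and MVb and supplies it to the prediction mode determination circuit 21 of FIG. 7.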
【０１３６】
図７の予測モード決定回路２１では、距離Ｄｆ，Ｄｂ、動きベクトルＭＶｆ，ＭＶｂ、ピクチャタイプＴＹＰＥ、およびＳＭＶに基づいて、マクロブロックの予測モードが決定される。
【０１３７】
即ち、ピクチャタイプＴＹＰＥがＩピクチャである場合、即ち、符号化対象のマクロブロックがＩピクチャである場合、予測モード決定回路２１は、予測モードを、イントラ符号化モードに決定する。
【０１３８】
また、ピクチャタイプＴＹＰＥがＰピクチャである場合、即ち、符号化対象のマクロブロックがＰピクチャである場合、予測モード決定回路２１は、次のようにして、予測モードを、イントラ符号化モードまたは順方向予測符号化モードのうちのいずれかに決定する。
【０１３９】
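In this case, the prediction mode determination circuit 21 first calculates, as the prediction residual for intra coding, the quantity $E_{intra}$ defined, for example, by the following expression.
【０１４０】
$E_{intra} = \sum |A_{ij} - A_{av}|$
In this expression, $A_{ij}$ is the value of the pixel that is i-th from the left and j-th from the top in the macroblock being coded, and $A_{av}$ is the mean of those pixel values. Σ denotes summation over i and j from 1 to 16.
【０１４１】
The prediction mode determination circuit 21 sets the prediction mode to the intra coding mode when the intra prediction residual $E_{intra}$ is smaller than (or at most) the prediction residual Ef of forward predictive coding. It sets the prediction mode to the forward predictive coding mode when $E_{intra}$ is at least (or greater than) Ef.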
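The intra measure and the P-picture decision can be sketched together. The function names `intra_residual` and `select_p_mode` are hypothetical; the comparison follows the text, with ties going to forward prediction.

```python
def intra_residual(block: list) -> float:
    """E_intra: sum of absolute deviations of the pixel values from their mean."""
    pixels = [value for row in block for value in row]
    mean = sum(pixels) / len(pixels)
    return sum(abs(value - mean) for value in pixels)


def select_p_mode(e_intra: float, ef: float) -> str:
    """Prediction mode for a macroblock of a P picture."""
    return "intra" if e_intra < ef else "forward"


if __name__ == "__main__":
    flat = [[128] * 16 for _ in range(16)]
    # A flat block has E_intra = 0, so intra coding wins against any positive Ef.
    print(select_p_mode(intra_residual(flat), ef=300))
```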
【０１４２】
次に、ピクチャタイプＴＹＰＥがＢピクチャである場合、即ち、符号化対象のマクロブロックがＢピクチャである場合、予測モード決定回路２１は、次のようにして、予測モードを、イントラ符号化モード、順方向予測符号化モード、逆方向予測符号化モード、または双方向予測符号化モードのうちのいずれかに決定する。
【０１４３】
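First, the prediction mode determination circuit 21 selects one of the inter coding modes, that is, the forward predictive coding mode, the backward predictive coding mode, or the bidirectional predictive coding mode.
【０１４４】
This selection is based on SMV, the prediction residuals Ef and Eb, the distances Df and Db, and the motion vectors MVf and MVb.
【０１４５】
First, the constant T_i described with reference to FIG. 4 is set according to SMV. Then either FIG. 4(A) or FIG. 4(B) is chosen according to the distances Df and Db, and within the chosen set of conditions, one of the forward, backward, and bidirectional predictive coding modes is selected on the basis of the relationship between the prediction residuals Ef and Eb described above.
【０１４６】
When SMV is at most the predetermined value MV_0, or when, for the motion vectors MVf and MVb, the absolute value of one of the x and y components is sufficiently larger than that of the other, one of the forward, backward, and bidirectional predictive coding modes is selected on the basis of the relationship between Ef and Eb described with reference to FIG. 5, as explained above.
【０１４７】
The prediction residual corresponding to the prediction mode selected from the inter coding modes is taken as the inter coding prediction residual E_inter. When the bidirectional predictive coding mode is selected, E_inter is, for example, the mean of Ef and Eb. E_inter is thus Ef, Eb, or (Ef + Eb)/2 when the forward, backward, or bidirectional predictive coding mode is selected, respectively.
【０１４８】
The prediction mode determination circuit 21 also calculates the intra prediction residual E_intra in the same way as described above. It sets the prediction mode to the intra coding mode when E_intra is smaller than the prediction residual E_inter of the selected inter coding mode, and to the selected inter coding mode when E_intra is greater than or equal to E_inter.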
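The final choice between intra coding and the already selected inter mode reduces to one comparison. The function names and string labels below are hypothetical; the values of E_inter are those stated in the text.

```python
def inter_residual(inter_mode: str, ef: float, eb: float) -> float:
    """E_inter for the inter mode that was selected."""
    if inter_mode == "forward":
        return ef
    if inter_mode == "backward":
        return eb
    return (ef + eb) / 2          # bidirectional: mean of the two residuals


def select_b_mode(inter_mode: str, ef: float, eb: float, e_intra: float) -> str:
    """Final prediction mode for a macroblock of a B picture."""
    e_inter = inter_residual(inter_mode, ef, eb)
    return "intra" if e_intra < e_inter else inter_mode


if __name__ == "__main__":
    print(select_b_mode("bidirectional", ef=100, eb=200, e_intra=120))
    print(select_b_mode("forward", ef=100, eb=200, e_intra=120))
```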
【０１４９】
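Accordingly, for B pictures, the prediction mode is determined adaptively according to the complexity of the motion in the picture and the distances to the reference pictures, so coding efficiency can be further improved.
【０１５０】
The prediction mode determined in this way is supplied from the prediction mode determination circuit 21 to the arithmetic unit 11, the variable-length coding circuit 15, and the motion compensation circuit 20.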
【０１５１】
演算部１１には、予測モード決定回路２１から供給される予測モードで予測符号化すべきマクロブロック（符号化対象のマクロブロック）が、図６のスキャンコンバータ５から供給される。演算部１１は、演算器１１Ａ乃至１１ＣおよびスイッチＳＷを有しており、予測モードに対応して、スイッチＳＷが切り換えられる。
【０１５２】
即ち、演算部１１にＩピクチャのマクロブロックが入力される場合においては、予測モードはイントラ符号化モードとなっている。この場合、スイッチＳＷは端子ａを選択する。端子ａには、符号化対象のマクロブロックが、そのまま供給されるようになされており、従って、このマクロブロックは、端子ａを介して、ＤＣＴ回路１２に供給される。
【０１５３】
ＤＣＴ回路１２では、演算部１１からのマクロブロックがＤＣＴ処理され、これにより、ＤＣＴ係数に変換される。このＤＣＴ係数は、量子化回路１３に供給され、そこで、所定の量子化ステップで量子化された後、可変長符号化回路１５に供給される。
【０１５４】
可変長符号化回路１５には、量子化回路１３から量子化されたＤＣＴ係数が供給される他、同じく量子化回路１３から量子化ステップが、予測モード決定回路２１から予測モードが、図６の動きベクトル推定回路６から動きベクトルＭＶｆ，ＭＶｂが、それぞれ供給されるようになされている。可変長符号化回路１５は、適宜、これらのデータを、例えば、ハフマン符号などの可変長符号に変換し、送信バッファ１４に出力する。
【０１５５】
送信バッファ１４は、可変長符号化回路１５からの可変長符号を一時記憶し、例えば、一定のデータレートにして出力する。送信バッファ１４から出力される可変長符号は、例えば、光ディスクや、光磁気ディスク、磁気ディスク、光カード、磁気テープ、相変化ディスクなどの記録媒体３１に記録され、あるいは、衛星回線、地上波、ＣＡＴＶ網、インターネットなどの伝送路３２を介して伝送される。
【０１５６】
なお、送信バッファ１４は、そのデータの蓄積量を量子化回路１３に供給（フィードバック）するようになされている。量子化回路１３は、この蓄積量に基づいて、量子化ステップを設定するようになされている。即ち、量子化回路１３は、送信バッファ１４がオーバーフローしそうなとき、量子化ステップを大きくし、これにより、データ発生量を減少させる。また、量子化回路１３は、送信バッファ１４がアンダーフローしそうなとき、量子化ステップを小さくし、これにより、データ発生量を増加させる。以上のようにして、送信バッファ１４のオーバーフローおよびアンダーフローを防止するようになされている。
【０１５７】
一方、量子化回路１３が出力する量子化されたＤＣＴ係数と量子化ステップとは、可変長符号化回路１５の他、逆量子化回路１６にも供給される。逆量子化回路１６は、量子化回路１３からの量子化されたＤＣＴ係数を、同じく量子化回路１３からの量子化ステップで逆量子化し、その結果得られるＤＣＴ係数を、ＩＤＣＴ回路１７に出力する。
【０１５８】
ＩＤＣＴ回路１７では、逆量子化回路１６からのＤＣＴ係数が逆ＤＣＴ処理され、これにより、演算部１１の出力とほぼ同一の値の画像データが復元され、演算器１８に供給される。演算器１８は、そこに入力される画像データが、イントラ符号化されるものである場合には、特に処理を行わず、その画像データを、そのままフレームメモリ１９に出力して記憶させる。
【０１５９】
なお、フレームメモリ１９は、未来参照画像または過去参照画像として用いられる画像を記憶する未来参照画像記憶回路１９Ａおよび過去参照画像記憶回路１９Ｂを有しており、最初に符号化され、復号化されたＩピクチャは、過去参照画像記憶回路１９Ｂに記憶される。
【０１６０】
次に、演算部１１に入力されたマクロブロックがＰピクチャである場合において、予測モードがイントラ符号化モードであるときには、スイッチＳＷは端子ａを選択する。従って、この場合、Ｐピクチャのマクロブロックは、上述のＩピクチャにおける場合と同様に符号化され、また、ローカルデコードされて、フレームメモリ１９に供給される。なお、Ｉピクチャの次に符号化され、復号化されたＰピクチャは、未来参照画像記憶回路１９Ａに記憶される。
【０１６１】
一方、演算部１１に入力されたマクロブロックがＰピクチャである場合において、予測モードが順方向予測符号化モードであるときには、スイッチＳＷは、端子ｂを選択する。端子ｂには、演算器１１Ａの出力が供給されるようになされており、また、演算器１１Ａには、符号化対象のマクロブロックと、動き補償回路２０の出力とが供給されるようになされている。
【０１６２】
動き補償回路２０は、予測モードが順方向予測符号化モードの場合、過去参照画像記憶回路１９Ｂに記憶されている画像（いまの場合、Ｉピクチャ）を過去参照画像として読み出し、動きベクトルＭＶｆにしたがって動き補償を施すことにより予測画像を生成する。即ち、動き補償回路２０は、符号化対象のマクロブロックに対応する位置から、動きベクトルＭＶｆに対応する分だけずらしたアドレスのデータを、過去参照画像記憶回路１９Ｂから読み出し、これを予測画像として演算器１１Ａに供給する。
【０１６３】
演算器１１Ａは、符号化対象のマクロブロックを構成する各画素値から、予測画像を構成する、対応する画素値を減算し、その減算値（差分値）を出力する。従って、この場合、演算部１１からは、符号化対象のマクロブロックと、過去参照画像から得られた予測画像との差分値が、ＤＣＴ回路１２に供給される。この差分値は、イントラ符号化における場合と同様に符号化されて出力される。
【０１６４】
さらに、この差分値は、上述した場合と同様に、ＤＣＴ回路１２、量子化回路１３、逆量子化回路１６、およびＩＤＣＴ回路１７を介することで、元の値とほぼ同一の値に復元され、演算器１８に供給される。
【０１６５】
この場合、演算器１８には、動き補償回路２０から、演算器１１Ａに供給される予測画像と同一のデータが供給されており、演算器１８では、復元された差分値と、その予測画像とが加算され、これにより、Ｐピクチャがローカルデコードされる。このローカルデコードされたＰピクチャは、フレームメモリ１９に供給されて記憶される。
【０１６６】
なお、Ｉピクチャの次に符号化され、復号化されたＰピクチャは、上述したように、未来参照画像記憶回路１９Ａに記憶される。
【０１６７】
次に、演算部１１に入力されたマクロブロックがＢピクチャである場合において、予測モードがイントラ符号化モードまたは順方向予測符号化モードであるときには、スイッチＳＷは端子ａまたはｂをそれぞれ選択する。従って、この場合、Ｂピクチャのマクロブロックは、上述した場合と同様に符号化される。
【０１６８】
一方、演算部１１に入力されたマクロブロックがＢピクチャである場合において、予測モードが逆方向予測符号化モードであるときには、スイッチＳＷは、端子ｃを選択する。端子ｃには、演算器１１Ｂの出力が供給されるようになされており、また、演算器１１Ｂには、符号化対象のマクロブロックと、動き補償回路２０の出力とが供給されるようになされている。
【０１６９】
動き補償回路２０は、予測モードが逆方向予測符号化モードの場合、未来参照画像記憶回路１９Ａに記憶されている画像（いまの場合、Ｐピクチャ）を未来参照画像として読み出し、動きベクトルＭＶｂにしたがって動き補償を施すことにより予測画像を生成する。即ち、動き補償回路２０は、符号化対象のマクロブロックに対応する位置から、動きベクトルＭＶｂに対応する分だけずらしたアドレスのデータを、未来参照画像記憶回路１９Ａから読み出し、これを予測画像として演算器１１Ｂに供給する。
【０１７０】
演算器１１Ｂは、符号化対象のマクロブロックを構成する各画素値から、予測画像を構成する、対応する画素値を減算し、その減算値（差分値）を出力する。従って、この場合、演算部１１からは、符号化対象のマクロブロックと、未来参照画像から得られた予測画像との差分値が、ＤＣＴ回路１２に供給される。この差分値は、イントラ符号化における場合と同様に符号化されて出力される。
【０１７１】
また、演算部１１に入力されたマクロブロックがＢピクチャである場合において、予測モードが双方向予測符号化モードであるときには、スイッチＳＷは、端子ｄを選択する。端子ｄには、演算器１１Ｃの出力が供給されるようになされており、また、演算器１１Ｃには、符号化対象のマクロブロックと、動き補償回路２０の出力とが供給されるようになされている。
【０１７２】
動き補償回路２０は、予測モードが双方向予測符号化モードの場合、過去参照画像記憶回路１９Ｂに記憶されている画像（いまの場合、Ｉピクチャ）を過去参照画像として読み出し、動きベクトルＭＶｆにしたがって動き補償を施すことにより予測画像（以下、適宜、過去予測画像という）を生成するとともに、未来参照画像記憶回路１９Ａに記憶されている画像（いまの場合、Ｐピクチャ）を未来参照画像として読み出し、動きベクトルＭＶｂにしたがって動き補償を施すことにより予測画像（以下、適宜、未来予測画像という）を生成する。この過去予測画像および未来予測画像は、演算器１１Ｃに供給される。
【０１７３】
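The calculator 11C first computes, for example, the average of the past predicted picture and the future predicted picture supplied from the motion compensation circuit 20 (hereinafter the average predicted picture). It then subtracts each pixel value of the average predicted picture from the corresponding pixel value of the macroblock being coded and outputs the result (the difference). In this case, therefore, the arithmetic unit 11 supplies the difference between the macroblock being coded and the average predicted picture to the DCT circuit 12. This difference is coded and output in the same way as in intra coding.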
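The operation of calculator 11C can be sketched as an element-wise average followed by a subtraction. The function name is hypothetical, and using an exact (unrounded) average is an assumption: the text only says "for example, the average" and does not specify rounding.

```python
def bidirectional_difference(block: list, past_pred: list, future_pred: list) -> list:
    """Difference between a macroblock and the average of its two predicted blocks."""
    return [[a - (f + b) / 2 for a, f, b in zip(row_a, row_f, row_b)]
            for row_a, row_f, row_b in zip(block, past_pred, future_pred)]


if __name__ == "__main__":
    block = [[10, 20]]
    past = [[8, 18]]
    future = [[12, 26]]
    print(bidirectional_difference(block, past, future))
```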
【０１７４】
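In this embodiment, B pictures are not used as reference pictures when other pictures are coded, so they are not (and need not be) locally decoded. The future reference picture storage circuit 19A and the past reference picture storage circuit 19B can be bank-switched as necessary, so that the picture data stored in either circuit can be used as either a past or a future reference picture. The processing described above is applied to all of the luminance signal Y and the chrominance signals Cb and Cr. For the chrominance signals Cb and Cr, however, the motion vector used is, for example, the one used for the luminance signal Y with its magnitude halved.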
【０１７５】
次に、図１０のフローチャートを参照して、図７の予測モード決定回路２１の処理（予測モード決定処理）について、さらに説明する。
【０１７６】
予測モード決定回路２１では、図１０のフローチャートにしたがった処理が、マクロブロックごとに行われる。
【０１７７】
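First, in step S1, the prediction mode determination circuit 21 determines whether SMV is less than the threshold MV_0. If SMV is less than MV_0, processing proceeds to step S2, and one of the inter coding modes is selected as described with reference to FIG. 5.
【０１７８】
In step S2, it is determined whether the prediction residual Eb is greater than j times the prediction residual Ef (j × Ef). If Eb is greater than j × Ef, processing proceeds to step S3, forward predictive coding is selected as the inter coding, and processing ends.
【０１７９】
Thereafter, as described above, the final prediction mode is determined from the relationship between the prediction residual of the selected inter coding and the prediction residual of intra coding.
【０１８０】
If it is determined in step S2 that Eb is not greater than j × Ef, processing proceeds to step S4, where it is determined whether Eb is less than k times Ef (k × Ef). If Eb is less than k × Ef, processing proceeds to step S5, backward predictive coding is selected as the inter coding, and processing ends.
【０１８１】
If it is determined in step S4 that Eb is not less than k × Ef, that is, if k × Ef ≤ Eb ≤ j × Ef, processing proceeds to step S6, bidirectional predictive coding is selected as the inter coding, and processing ends.
【０１８２】
Before performing step S1, the prediction mode determination circuit 21 determines whether, for example, |x| > g|y| or |y| > g|x| holds for the x and y components of the forward motion vector MVf or the backward motion vector MVb, and if so, forcibly sets SMV to a value below MV_0, such as 0. Consequently, for pictures in which an object moves almost horizontally or almost vertically, the inter coding is selected under the conditions described with reference to FIG. 5, which favor bidirectional predictive coding.
【０１８３】
If it is determined in step S1 that SMV is not less than MV_0, processing proceeds to step S7_1, and the inter coding is selected as described with reference to FIG. 4.
【０１８４】
In step S7_1, it is determined whether SMV is at least MV_0 and less than MV_1. If so, processing proceeds to step S8_1, the constant T_i is set to T_1, and processing proceeds to step S9.
【０１８５】
If it is determined in step S7_1 that SMV is not in the range from MV_0 to less than MV_1, processing proceeds to step S7_2, where it is determined whether SMV is at least MV_1 and less than MV_2.
【０１８６】
Likewise, in step S7_c it is determined whether SMV is at least MV_{c-1} and less than MV_c. If so, processing proceeds to step S8_c, the constant T_i is set to T_c, and processing proceeds to step S9. If not, processing proceeds to step S7_{c+1}.
【０１８７】
If it is determined in step S7_n that SMV is not in the range from MV_{n-1} to less than MV_n, that is, if SMV is at least MV_n, processing proceeds to step S8_{n+1}, the constant T_i is set to T_{n+1}, and processing proceeds to step S9.
【０１８８】
In step S9, the inter-picture distance determination process corresponding to the distances Df and Db is performed, and processing ends.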
【０１８９】
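The flowchart of FIG. 11 shows the details of the inter-picture distance determination process in step S9 of FIG. 10. FIG. 11 assumes that one or two B pictures lie between I or P pictures.
【０１９０】
In the inter-picture distance determination process, it is first determined in step S11 whether Df is 1 and Db is 2. If Df is 1 and Db is 2, processing proceeds to step S12, and the inter coding is selected as described with reference to FIG. 4(A).
【０１９１】
In step S12, it is determined whether Eb is greater than q × Ef + (1 − p × q) × T_i and also greater than p × Ef. If both hold, processing proceeds to step S13, forward predictive coding is selected, and the process returns. If Eb is not greater than q × Ef + (1 − p × q) × T_i or not greater than p × Ef, processing proceeds to step S14, where it is determined whether Eb is less than r × Ef + (1 − p × r) × T_i and also less than p × Ef.
【０１９２】
If both conditions of step S14 hold, processing proceeds to step S15, backward predictive coding is selected, and the process returns. If Eb is not less than r × Ef + (1 − p × r) × T_i or not less than p × Ef, processing proceeds to step S16, bidirectional predictive coding is selected, and the process returns.
【０１９３】
If it is determined in step S11 that Df is not 1 or Db is not 2, processing proceeds to step S17, where it is determined whether Df is 2 and Db is 1.
【０１９４】
If Df is 2 and Db is 1, processing proceeds to step S18, and the inter coding is selected as described with reference to FIG. 4(B).
【０１９５】
In step S18, it is determined whether Eb is greater than t × Ef + (1 − s × t) × T_i and also greater than s × Ef. If both hold, processing proceeds to step S19, forward predictive coding is selected, and the process returns. If Eb is not greater than t × Ef + (1 − s × t) × T_i or not greater than s × Ef, processing proceeds to step S20, where it is determined whether Eb is less than u × Ef + (1 − s × u) × T_i and also less than s × Ef.
【０１９６】
If both conditions of step S20 hold, processing proceeds to step S21, backward predictive coding is selected, and the process returns. If Eb is not less than u × Ef + (1 − s × u) × T_i or not less than s × Ef, processing proceeds to step S22, bidirectional predictive coding is selected, and the process returns.
【０１９７】
If it is determined in step S17 that Df is not 2 or Db is not 1, processing proceeds to step S23, and one of the inter coding modes is selected as described with reference to FIG. 5. That is, steps S23 to S27 perform the same processing as steps S2 to S6 of FIG. 10, respectively, and the inter coding is thereby selected.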
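The two flowcharts can be combined into one function to show how the pieces fit together. This is a sketch under assumptions rather than the circuit's actual logic: all names are hypothetical, the constants are the example values from FIG. 4(A), FIG. 4(B), and FIG. 5, g = 4 is the example value mentioned for the pan test, and the MV and T tables passed in the usage lines are made-up numbers because the specification gives none.

```python
from bisect import bisect_right

FIG5 = (2.0, 0.5)            # (j, k)
FIG4A = (1.0, 5 / 4, 3 / 4)  # (p, q, r): past reference is nearer
FIG4B = (1.0, 4 / 3, 4 / 5)  # (s, t, u): future reference is nearer


def ratio_rule(ef, eb, j, k):
    """Steps S2 to S6 (and S23 to S27): plain ratio test of FIG. 5."""
    if eb > j * ef:
        return "forward"
    if eb < k * ef:
        return "backward"
    return "bidirectional"


def offset_rule(ef, eb, p, q, r, t_i):
    """Steps S12 to S16 (and S18 to S22): ratio test shifted by T_i."""
    if eb > q * ef + (1 - p * q) * t_i and eb > p * ef:
        return "forward"
    if eb < r * ef + (1 - p * r) * t_i and eb < p * ef:
        return "backward"
    return "bidirectional"


def select_inter_mode(ef, eb, df, db, mvf, mvb, smv_bounds, t_values, g=4.0):
    """Inter mode selection following FIG. 10 and FIG. 11.

    smv_bounds: [MV_0, ..., MV_n] increasing; t_values: [T_1, ..., T_{n+1}].
    """
    smv = abs(mvf[0] + mvb[0]) + abs(mvf[1] + mvb[1])
    # Pre-check: nearly horizontal or vertical motion forces SMV below MV_0.
    if any(abs(x) > g * abs(y) or abs(y) > g * abs(x) for x, y in (mvf, mvb)):
        smv = 0
    if smv < smv_bounds[0]:                                  # step S1
        return ratio_rule(ef, eb, *FIG5)
    t_i = t_values[bisect_right(smv_bounds, smv) - 1]        # steps S7 and S8
    if (df, db) == (1, 2):                                   # step S11
        return offset_rule(ef, eb, *FIG4A, t_i)
    if (df, db) == (2, 1):                                   # step S17
        return offset_rule(ef, eb, *FIG4B, t_i)
    return ratio_rule(ef, eb, *FIG5)                         # steps S23 to S27


if __name__ == "__main__":
    bounds, t_vals = [4, 16, 64], [50, 100, 200]  # hypothetical tables
    print(select_inter_mode(50, 60, 1, 2, (6, 5), (6, 4), bounds, t_vals))
    print(select_inter_mode(50, 60, 1, 2, (6, 0), (-12, 0), bounds, t_vals))
```

In the first call the same-sign vectors give a large SMV, so T_i is raised and the nearer (past) reference wins; in the second call the purely horizontal motion triggers the pre-check and the same residuals yield bidirectional prediction.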
【０１９８】
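As described above, the prediction mode is determined in accordance with SMV, which represents the complexity of the motion of the image, so coding efficiency can be improved over the conventional case.
[0199]
That is, when the motion of the image is complex, the bidirectional predictive coding mode is made harder to select, in consideration of the prediction accuracy and of the amount of data required to transmit the motion vectors. Conversely, when the motion of the image is simple, the bidirectional predictive coding mode is made easier to select. Efficient coding can therefore be performed.
[0200]
In addition to the complexity of the motion of the image, the prediction mode can also be determined in accordance with the speed of the motion of the image, as described above, or in accordance with both.
[0201]
In this embodiment, the complexity of the motion of the image is represented by the SMV described above, but it can also be represented by other physical quantities.
[0202]
Furthermore, in this embodiment, the speed of the motion of the image is expressed by the magnitude of the motion vector or by the sum of the absolute values of its x and y components, but it too can be represented by other physical quantities.
[0203]
Also, in this embodiment, when the bidirectional predictive coding mode is to be made easier to select, inter coding is selected under the conditions described with reference to FIG. 5. Alternatively, the bidirectional predictive coding mode can be made easier to select by selecting inter coding under conditions such as those shown in FIG. 12, which is similar to FIG. 18. In this case, however, it is desirable to make the constants a and c larger, or the constants b and d smaller, than in the case of FIG. 18.
[0204]
According to simulations performed by the present inventors, it has been confirmed that coding efficiency improves when q or t in FIG. 4 is smaller than a or c in FIG. 12, respectively, and when r or u in FIG. 4 is larger than b or d in FIG. 12, respectively. It has also been confirmed that coding efficiency improves when the bidirectional predictive coding mode is not used while the prediction errors Eb and Ef are small.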
【０２０５】
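[Effects of the Invention]
As described above, according to the image coding apparatus and image coding method of the present invention, a forward motion vector, which is the motion vector of the image to be encoded with respect to a temporally preceding past reference image, and a backward motion vector, which is the motion vector of the image to be encoded with respect to a temporally following future reference image, are detected. Based on the prediction residuals of the image to be encoded with respect to the past reference image and the future reference image, the prediction mode of the image to be encoded is determined to be one of the forward predictive coding mode, the backward predictive coding mode, and the bidirectional predictive coding mode. A predicted image is generated by performing motion compensation corresponding to the prediction mode, the difference value between the image to be encoded and the predicted image is calculated, and the difference value is encoded. With the prediction residuals with respect to the past reference image and the future reference image denoted Ef and Eb, respectively, and α, β, γ, and δ being predetermined constants (where α, β, γ, and δ are real numbers and γ < β), the prediction mode is set to the forward predictive coding mode, which generates the predicted image from the past reference image only, when Eb > α × Ef and Eb > β × Ef + (1 − α × β) × δ hold; to the backward predictive coding mode, which generates the predicted image from the future reference image only, when Eb ≦ α × Ef and Eb < γ × Ef + (1 − α × γ) × δ hold; and to the bidirectional predictive coding mode, which generates the predicted image from both the past and future reference images, when γ × Ef + (1 − α × γ) × δ ≦ Eb and Eb ≦ β × Ef + (1 − α × β) × δ hold. The larger the magnitude of the forward or backward motion vector, the larger δ is made, so that the bidirectional predictive coding mode is less likely to be selected. Accordingly, efficient coding can be performed based on the motion of the image.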
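[Brief Description of the Drawings]
FIG. 1 is a diagram showing a GOP.
FIG. 2 is a diagram showing a GOP.
FIG. 3 is a diagram for explaining that the prediction residual of a B picture differs depending on its distance from the I or P picture.
FIG. 4 is a diagram for explaining the conditions for selecting a prediction mode.
FIG. 5 is a diagram showing the conditions for selecting a prediction mode when the bidirectional predictive coding mode is made easier to select.
FIG. 6 is a block diagram showing the configuration of an embodiment of an image coding apparatus to which the present invention is applied.
FIG. 7 is a block diagram continuing from FIG. 6.
FIG. 8 is a diagram for explaining the processing of the image coding apparatus of FIGS. 6 and 7.
FIG. 9 is a diagram for explaining the processing of the scan converter 5 of FIG. 6.
FIG. 10 is a flowchart for explaining the processing of the prediction mode determination circuit 21 of FIG. 7.
FIG. 11 is a flowchart for explaining the details of the inter-image distance determination process of step S9 in FIG. 10.
FIG. 12 is a diagram showing the conditions for selecting a prediction mode when the bidirectional predictive coding mode is made easier to select.
FIG. 13 is a diagram for explaining motion-compensated predictive coding.
FIG. 14 is a diagram showing a GOP.
FIG. 15 is a diagram for explaining MPEG coding.
FIG. 16 is a diagram showing the conditions for selecting a prediction mode.
FIG. 17 is a diagram showing a GOP.
FIG. 18 is a diagram showing the conditions for selecting a prediction mode.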
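[Explanation of Reference Numerals]
3 image coding type designation circuit, 4 image coding order rearrangement circuit, 5 scan converter, 6 motion vector estimation circuit, 7 storage unit, 7A past reference image storage unit, 7B current image storage unit, 7C future reference image storage unit, 8 motion amount calculation circuit, 9 counter, 10 inter-image distance generation and calculation circuit, 11 arithmetic unit, 11A to 11C arithmetic units, 12 DCT circuit, 13 quantization circuit, 14 transmission buffer, 15 variable-length coding circuit, 16 inverse quantization circuit, 17 IDCT circuit, 18 arithmetic unit, 19 frame memory, 19A future reference image storage circuit, 19B past reference image storage circuit, 20 motion compensation circuit, 21 prediction mode determination circuit, 31 recording medium, 32 transmission path
[0001]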
BACKGROUND OF THE INVENTION
The present invention relates to an image encoding device and an image encoding method, and in particular to an image encoding device and an image encoding method suitable for use when, for example, a moving image is recorded on a recording medium such as a magneto-optical disk or magnetic tape, or when a moving image is transmitted from a transmitting side to a receiving side via a transmission path in a video conference system, a videophone system, broadcasting equipment, or the like.
[0002]
[Prior art]
For example, when a moving image is digitized and recorded or transmitted, since the amount of data is enormous, image data is conventionally compression-encoded. As a typical coding method for moving images, there is motion compensation predictive coding.
[0003]
Motion compensated predictive coding is a coding method that uses the correlation of images in the time axis direction. As shown in FIG. 13, the motion vector of the image to be encoded (the encoding target image, or current frame) with respect to the image being referred to (the reference image, or reference frame) is detected, and a predicted image is generated by motion-compensating the reference image, which has already been encoded and decoded, according to that motion vector. The prediction residual of the encoding target image with respect to this predicted image is then obtained, and the information amount of the moving image is compressed by encoding the prediction residual and the motion vector.
[0004]
As a specific example of motion compensation predictive coding, there is MPEG (Moving Picture Experts Group) coding. This is the common name of the video coding system compiled in WG (Working Group) 11 of SC (Sub Committee) 9 of JTC (Joint Technical Committee) 1 of ISO (International Organization for Standardization) and IEC (International Electrotechnical Commission).
[0005]
In MPEG, one frame or one field is divided into macroblocks composed of 16 lines × 16 pixels, and motion compensation predictive coding is performed in units of macroblocks.
[0006]
Here, motion compensation predictive coding is roughly divided into two coding methods: intra coding and inter coding (non-intra coding). In intra coding, the information of the macroblock to be encoded is itself encoded as it is. In inter coding, a frame (or field) at another time is used as a reference image, and the difference between the predicted image generated from that reference image and the macroblock's own information is encoded.
[0007]
In MPEG, each frame is encoded as one of an I picture (Intra coded picture), a P picture (Predictive coded picture), or a B picture (Bidirectionally predictive picture). In MPEG, processing is performed in GOP (Group Of Picture) units.
[0008]
That is, in MPEG, a GOP is composed of, for example, 17 frames as shown in FIG. 14. When the frames constituting this GOP are denoted F1, F2, ..., F17 from the head, then, for example, as shown in the figure, frame F1 is processed as an I picture, frame F2 as a B picture, and frame F3 as a P picture. The subsequent frames F4 to F17 are processed alternately as B pictures and P pictures.
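The picture-type assignment of this example GOP can be sketched in a few lines of Python. The function name `gop_picture_types` is hypothetical, and the pattern is only the 17-frame example described above, not a general requirement of MPEG.

```python
def gop_picture_types(num_frames=17):
    """Return the picture types of frames F1..Fn in the example GOP.

    F1 is an I picture; after that B and P pictures alternate
    (F2 = B, F3 = P, F4 = B, ...), as in the example described above.
    """
    types = []
    for n in range(1, num_frames + 1):
        if n == 1:
            types.append("I")
        elif n % 2 == 0:
            types.append("B")
        else:
            types.append("P")
    return types


# Print the assignment in display order, e.g. "F1:I F2:B F3:P ..."
types = gop_picture_types()
print(" ".join(f"F{i}:{t}" for i, t in enumerate(types, start=1)))
```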
[0009]
While I pictures are intra-coded, P and B pictures are basically inter-coded. That is, as shown by the arrows in FIG. 14A, a P picture is basically inter-coded using the immediately preceding I or P picture as a reference image. As shown by the arrows in FIG. 14B, a B picture is basically inter-coded using both the immediately preceding I or P picture and the immediately following P picture, or either one of them, as reference images.
[0010]
More specifically, as shown in FIG. 15, first, the frame F1 is processed as an I picture. That is, all the macroblocks are intra-coded (SP1) and transmitted as transmission data F1X via a transmission path.
[0011]
Next, since the reference image for frame F2, which is a B picture, may be a temporally following image (a future image), frame F3, which is a P picture, is processed first. For frame F3, frame F1, the immediately preceding I picture, is used as the reference image, the prediction residual with respect to the predicted image generated from that reference image is obtained (forward predictive coding) (SP3), and this is transmitted as transmission data F3X together with the motion vector x3 with respect to frame F1. Alternatively, frame F3 is intra-coded in the same manner as frame F1 (SP1) and transmitted as transmission data F3X. Whether a P picture is intra-coded or forward-predictive-coded can be switched on a macroblock basis.
[0012]
After encoding the frame F3, the frame F2, which is a B picture, is processed. The B picture is subjected to intra coding, forward prediction coding, backward prediction coding, or bidirectional prediction coding.
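The reordering described here (F1, then F3, then F2) can be illustrated with a small Python sketch that converts display order into coding order. The function name `coding_order` is hypothetical, and the handling of B pictures left over at the end of the list is an assumption made only so the function always returns every frame.

```python
def coding_order(display_types):
    """Reorder 1-based frame numbers from display order to coding order.

    Each run of B pictures is held back until the next I or P picture,
    which may serve as its future reference, has been emitted.
    """
    order = []
    pending_b = []
    for number, ptype in enumerate(display_types, start=1):
        if ptype == "B":
            pending_b.append(number)
        else:
            order.append(number)     # the reference picture is coded first
            order.extend(pending_b)  # then the B pictures that precede it
            pending_b = []
    # Assumption: trailing B pictures are simply appended; a real encoder
    # would wait for the next reference picture.
    order.extend(pending_b)
    return order


print(coding_order(["I", "B", "P", "B", "P"]))  # [1, 3, 2, 5, 4]
```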
[0013]
That is, in intra coding, the frame F2 is transmitted as it is as the transmission data F2X as in the frame F1 (SP1).
[0014]
In forward predictive coding, for frame F2, frame F1, which is the immediately preceding (temporally preceding) I or P picture, is used as the reference image, the prediction residual with respect to the predicted image generated from that reference image is obtained (forward predictive coding) (SP3), and this is transmitted as transmission data F2X together with the motion vector x1 with respect to frame F1.
[0015]
In backward predictive coding, for frame F2, frame F3, which is the immediately following (temporally following) I or P picture, is used as the reference image, the prediction residual with respect to the predicted image generated from that reference image is obtained (backward predictive coding) (SP2), and this is transmitted as transmission data F2X together with the motion vector x2 with respect to frame F3.
[0016]
In the bi-directional predictive coding, the frame F2 uses the frames F1 and F3 as reference images and obtains a prediction residual with respect to an average value of predicted images generated from the reference images (bidirectional predictive coding) (SP4). This is transmitted as transmission data F2X together with motion vectors x1 and x2 for frames F1 and F3.
[0017]
Whether a B picture is encoded by intra coding, forward predictive coding, backward predictive coding, or bidirectional predictive coding can be switched in units of macroblocks, as in the case of P pictures.
[0018]
In contrast to intra coding, forward predictive coding, backward predictive coding, and bidirectional predictive coding are collectively referred to as inter coding (non-intra coding).
[0019]
Hereinafter, a reference image that precedes in time is referred to as a past reference image, and a reference image that follows in time is referred to as a future reference image.
[0020]
[Problems to be solved by the invention]
When encoding a macroblock of a B picture, it is desirable for the image coding apparatus to select, from among intra coding, forward predictive coding, backward predictive coding, and bidirectional predictive coding, the prediction mode with the highest coding efficiency.
[0021]
Therefore, there is a method in which a B picture is encoded in each of the above four prediction modes, and the one with the smallest data amount obtained as a result is selected.
[0022]
However, since this method requires encoding in each of the four prediction modes, it takes time for processing or increases the scale of the apparatus.
[0023]
Therefore, the present applicant has previously proposed the following method. A forward motion vector, which is the motion vector of the encoding target image with respect to the past reference image, and a backward motion vector, which is the motion vector of the encoding target image with respect to the future reference image, are detected (ME: Motion Estimation). Predicted images are obtained by motion-compensating the past reference image and the future reference image according to the forward and backward motion vectors, respectively. The prediction mode of the B picture is then determined in accordance with the prediction residual of the encoding target image with respect to each predicted image (ME error; hereinafter also referred to as the motion vector estimation residual where appropriate). More precisely, the method selects one of the three types of inter coding (forward predictive coding, backward predictive coding, and bidirectional predictive coding).
[0024]
In this method (hereinafter referred to as the first method where appropriate), first, for example, the difference between the pixel values of the macroblock to be encoded and those of the predicted macroblock obtained by motion-compensating the reference image is obtained as the motion vector estimation residual.
[0025]
Then, when the motion vector estimation residual with respect to the past reference image or the future reference image is set to Ef or Eb, which of the inter coding is used is determined as shown in FIG. 16, for example.
[0026]
That is, when the equation Eb> j × Ef holds, the forward prediction encoding is selected, and when the equation Eb <k × Ef holds, the backward prediction encoding is selected. In other cases, that is, when the equation k × Ef ≦ Eb ≦ j × Ef holds, bi-directional predictive coding is selected.
[0027]
Note that 0 < k < j, and that j = 2 and k = 1/2 in FIG. 16.
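As a minimal sketch, the first method can be written as a three-way comparison of the two residuals. The function name and the mode labels are hypothetical; the default constants are the FIG. 16 values stated above.

```python
def select_inter_mode_first_method(ef, eb, j=2.0, k=0.5):
    """Pick an inter coding mode from the motion vector estimation residuals.

    ef: residual with respect to the past reference image (Ef)
    eb: residual with respect to the future reference image (Eb)
    j, k: constants with 0 < k < j (j = 2 and k = 1/2 in the example)
    """
    if not 0 < k < j:
        raise ValueError("expected 0 < k < j")
    if eb > j * ef:
        return "forward"       # Eb > j * Ef
    if eb < k * ef:
        return "backward"      # Eb < k * Ef
    return "bidirectional"     # k * Ef <= Eb <= j * Ef


for ef, eb in [(10, 25), (10, 4), (10, 10)]:
    print(ef, eb, select_inter_mode_first_method(ef, eb))
```

With these constants, bidirectional coding is chosen whenever neither residual exceeds twice the other.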
[0028]
Here, in the present specification, the symbols < and > may be replaced by the symbols ≦ and ≧, respectively. Similarly, the symbols ≦ and ≧ may be replaced by the symbols < and >, respectively.
[0029]
Therefore, when the prediction residual Ef from the forward motion vector is relatively small compared with the prediction residual Eb from the backward motion vector (in FIG. 16, less than (or at most) 1/2 of it), forward predictive coding is selected. When the prediction residual Eb from the backward motion vector is relatively small compared with the prediction residual Ef from the forward motion vector (in FIG. 16, less than (or at most) 1/2 of it), backward predictive coding is selected. When the ratio between the prediction residuals Ef and Eb is neither very large nor very small (in FIG. 16, when Ef / Eb is at least (or greater than) 1/2 and at most (or less than) 2), bidirectional predictive coding is selected.
[0030]
Incidentally, when the image sequence is configured with one B picture (frame or field) arranged between I or P pictures, as shown in FIG. 14, the temporal distances from the B picture to the I or P pictures (I/P pictures) serving as the past reference image and the future reference image are the same, so the first method can improve coding efficiency.
[0031]
However, when the image sequence is configured with two or more B pictures arranged between I or P pictures, for example two B pictures as shown in FIG. 17, bidirectional predictive coding was sometimes selected even though forward predictive coding or backward predictive coding had the highest coding efficiency among the inter coding modes.
[0032]
This has been confirmed by a simulation performed by the present inventors.
[0033]
This is because, as shown in FIG. 17, the distance from the B picture to the I / P picture that becomes the past reference image or the future reference image is different.
[0034]
That is, when two B pictures are arranged, the distance to the future reference image is longer than the distance to the past reference image for the first B picture, whereas for the second B picture the distance to the past reference image is longer than the distance to the future reference image. Therefore, the prediction accuracy of the backward motion vector is degraded for the first B picture, and the prediction accuracy of the forward motion vector is degraded for the second B picture.
[0035]
Accordingly, the present applicant has previously proposed a method (hereinafter referred to as the second method where appropriate) that determines the prediction mode in consideration of the distances from the B picture to the past reference image and to the future reference image, so that images can be encoded efficiently even when two or more B pictures are arranged between the past reference image and the future reference image (for example, Japanese Patent Application No. 7-210665).
[0036]
In the second method, the condition for selecting one of the inter coding modes is changed depending on whether the B picture to be encoded is closer to the past reference image or to the future reference image.
[0037]
That is, when the B picture to be encoded is close to the past reference image (for example, frames F2, F5, F8, ... in FIG. 17), as shown in FIG. 18A, forward predictive coding is selected when the expression Eb > a × Ef holds, and backward predictive coding is selected when the expression Eb < b × Ef holds. When the expression b × Ef ≦ Eb ≦ a × Ef holds, bidirectional predictive coding is selected.
[0038]
Here, 0 < b < a, and a is a value smaller than j in FIG. 16. In FIG. 18A, a = 4/3 and b = 1/2.
[0039]
On the other hand, when the B picture to be encoded is close to the future reference image (for example, frames F3, F6, F9, ... in FIG. 17), as shown in FIG. 18B, forward predictive coding is selected when the expression Eb > c × Ef holds, and backward predictive coding is selected when the expression Eb < d × Ef holds. When the expression d × Ef ≦ Eb ≦ c × Ef holds, bidirectional predictive coding is selected.
[0040]
Here, 0 < d < c, and d is a value larger than k in FIG. 16. In FIG. 18B, c = 2 and d = 3/4.
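The second method differs from the first only in that the pair of constants depends on which reference image the B picture is nearer to. The sketch below assumes the FIG. 18 values quoted above; the dictionary name, the function name, and the `"past"` / `"future"` keys are hypothetical.

```python
# (upper, lower) slope constants: (a, b) for FIG. 18A, (c, d) for FIG. 18B
FIG18_CONSTANTS = {
    "past": (4 / 3, 1 / 2),   # B picture nearer the past reference image
    "future": (2.0, 3 / 4),   # B picture nearer the future reference image
}


def select_inter_mode_second_method(ef, eb, nearer):
    """Select an inter coding mode using distance-dependent constants.

    nearer: "past" or "future", i.e. which reference image is closer.
    """
    upper, lower = FIG18_CONSTANTS[nearer]
    if eb > upper * ef:
        return "forward"
    if eb < lower * ef:
        return "backward"
    return "bidirectional"


# The same residual pair can map to different modes depending on distance.
print(select_inter_mode_second_method(10, 15, "past"))    # forward
print(select_inter_mode_second_method(10, 15, "future"))  # bidirectional
```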
[0041]
By doing as described above, when the B picture to be encoded is close to a past reference image, it is easy to select forward prediction encoding using only the past reference image, and is close to a future reference image. In this case, it is easy to select backward prediction encoding using only the future reference image. Therefore, it becomes easy to perform predictive coding using only a reference image with high prediction accuracy, and as a result, coding efficiency can be improved.
[0042]
However, with the second method, coding efficiency was sometimes slightly degraded when, for example, an image with slow motion or an image with simple, uniform motion, such as an object panning in the horizontal direction, was the encoding target.
[0043]
That is, for images with slow motion and images with simple, uniform motion, bidirectional predictive coding gives higher prediction accuracy than forward or backward predictive coding, and therefore higher coding efficiency. In the second method, however, as shown in FIG. 18, the range in which bidirectional predictive coding is selected is narrower than in FIG. 16, and the range in which forward or backward predictive coding is selected is wider. As a result, with the second method, forward or backward predictive coding tends to be selected over bidirectional predictive coding even when encoding an image with slow motion or with simple, uniform motion, and coding efficiency is degraded.
[0044]
Meanwhile, inter coding has conventionally been selected (that is, one of forward predictive coding, backward predictive coding, and bidirectional predictive coding has been chosen) without considering the amount of bits required to transmit the motion vectors.
[0045]
That is, in the past, basically, the one with the smallest prediction residual among the forward prediction coding, the backward prediction coding, and the bidirectional prediction coding has been selected.
[0046]
However, when, for example, the prediction residuals of forward predictive coding, backward predictive coding, and bidirectional predictive coding are all small, forward or backward predictive coding may be more efficient than bidirectional predictive coding once the amount of bits required to transmit the motion vectors is considered, even if the residual of bidirectional predictive coding is the smallest.
[0047]
Such a case often occurs, for example, when an image with a fast motion is encoded.
[0048]
The present invention has been made in view of such a situation, and is intended to further improve the coding efficiency of an image.
[0049]
[Means for Solving the Problems]
The image coding apparatus according to the present invention comprises: motion vector detection means for detecting a forward motion vector, which is the motion vector of the image to be encoded with respect to a temporally preceding past reference image, and a backward motion vector, which is the motion vector of the image to be encoded with respect to a temporally following future reference image; prediction mode determination means for determining the prediction mode of the image to be encoded as one of a forward predictive coding mode, a backward predictive coding mode, and a bidirectional predictive coding mode, based on the prediction residuals of the image to be encoded with respect to the past reference image and the future reference image; motion compensation means for generating a predicted image by performing motion compensation corresponding to the prediction mode; difference value calculation means for calculating the difference value between the image to be encoded and the predicted image; and encoding means for encoding the difference value. With the prediction residuals with respect to the past reference image and the future reference image denoted Ef and Eb, respectively, and α, β, γ, and δ being predetermined constants (where α, β, γ, and δ are real numbers and γ < β), the prediction mode determination means sets the prediction mode to the forward predictive coding mode, which generates the predicted image from the past reference image only, when the expressions Eb > α × Ef and Eb > β × Ef + (1 − α × β) × δ hold; to the backward predictive coding mode, which generates the predicted image from the future reference image only, when the expressions Eb ≦ α × Ef and Eb < γ × Ef + (1 − α × γ) × δ hold; and to the bidirectional predictive coding mode, which generates the predicted image from both the past reference image and the future reference image, when the expressions γ × Ef + (1 − α × γ) × δ ≦ Eb and Eb ≦ β × Ef + (1 − α × β) × δ hold. The apparatus is characterized in that δ is made larger as the magnitude of the forward motion vector or the backward motion vector increases, so that the bidirectional predictive coding mode is less likely to be selected.
[0050]
The image encoding method of the present invention comprises: detecting a forward motion vector, which is the motion vector of the image to be encoded with respect to a temporally preceding past reference image, and a backward motion vector, which is the motion vector of the image to be encoded with respect to a temporally following future reference image; determining the prediction mode of the image to be encoded as one of a forward predictive coding mode, a backward predictive coding mode, and a bidirectional predictive coding mode, based on the prediction residuals of the image to be encoded with respect to the past reference image and the future reference image; generating a predicted image by performing motion compensation corresponding to the prediction mode; calculating the difference value between the image to be encoded and the predicted image; and encoding the difference value. With the prediction residuals with respect to the past reference image and the future reference image denoted Ef and Eb, respectively, and α, β, γ, and δ being predetermined constants (where α, β, γ, and δ are real numbers and γ < β), the prediction mode is set to the forward predictive coding mode, which generates the predicted image from the past reference image only, when the expressions Eb > α × Ef and Eb > β × Ef + (1 − α × β) × δ hold; to the backward predictive coding mode, which generates the predicted image from the future reference image only, when the expressions Eb ≦ α × Ef and Eb < γ × Ef + (1 − α × γ) × δ hold; and to the bidirectional predictive coding mode, which generates the predicted image from both the past reference image and the future reference image, when the expressions γ × Ef + (1 − α × γ) × δ ≦ Eb and Eb ≦ β × Ef + (1 − α × β) × δ hold. The method is characterized in that δ is made larger as the magnitude of the forward motion vector or the backward motion vector increases, so that the bidirectional predictive coding mode is less likely to be selected.
[0051]
In the image encoding device of the present invention, a forward motion vector, which is the motion vector of the image to be encoded with respect to a temporally preceding past reference image, and a backward motion vector, which is the motion vector of the image to be encoded with respect to a temporally following future reference image, are detected. Based on the prediction residuals of the image to be encoded with respect to the past reference image and the future reference image, the prediction mode of the image to be encoded is determined to be one of a forward predictive coding mode, a backward predictive coding mode, and a bidirectional predictive coding mode. A predicted image is generated by performing motion compensation corresponding to the prediction mode, the difference value between the image to be encoded and the predicted image is calculated, and the difference value is encoded. With the prediction residuals with respect to the past reference image and the future reference image denoted Ef and Eb, respectively, and α, β, γ, and δ being predetermined constants (where α, β, γ, and δ are real numbers and γ < β), the prediction mode is set to the forward predictive coding mode, which generates the predicted image from the past reference image only, when the expressions Eb > α × Ef and Eb > β × Ef + (1 − α × β) × δ hold; to the backward predictive coding mode, which generates the predicted image from the future reference image only, when the expressions Eb ≦ α × Ef and Eb < γ × Ef + (1 − α × γ) × δ hold; and to the bidirectional predictive coding mode, which generates the predicted image from both the past reference image and the future reference image, when the expressions γ × Ef + (1 − α × γ) × δ ≦ Eb and Eb ≦ β × Ef + (1 − α × β) × δ hold. The larger the magnitude of the forward motion vector or the backward motion vector, the larger δ is made, so that the bidirectional predictive coding mode is less likely to be selected.
[0052]
In the image coding method of the present invention, a forward motion vector, which is the motion vector of the image to be encoded with respect to a temporally preceding past reference image, and a backward motion vector, which is the motion vector of the image to be encoded with respect to a temporally following future reference image, are detected. Based on the prediction residuals of the image to be encoded with respect to the past reference image and the future reference image, the prediction mode of the image to be encoded is determined to be one of a forward predictive coding mode, a backward predictive coding mode, and a bidirectional predictive coding mode. A predicted image is generated by performing motion compensation corresponding to the prediction mode, the difference value between the image to be encoded and the predicted image is calculated, and the difference value is encoded. With the prediction residuals with respect to the past reference image and the future reference image denoted Ef and Eb, respectively, and α, β, γ, and δ being predetermined constants (where α, β, γ, and δ are real numbers and γ < β), the prediction mode is set to the forward predictive coding mode, which generates the predicted image from the past reference image only, when the expressions Eb > α × Ef and Eb > β × Ef + (1 − α × β) × δ hold; to the backward predictive coding mode, which generates the predicted image from the future reference image only, when the expressions Eb ≦ α × Ef and Eb < γ × Ef + (1 − α × γ) × δ hold; and to the bidirectional predictive coding mode, which generates the predicted image from both the past reference image and the future reference image, when the expressions γ × Ef + (1 − α × γ) × δ ≦ Eb and Eb ≦ β × Ef + (1 − α × β) × δ hold. The larger the magnitude of the forward motion vector or the backward motion vector, the larger δ is made, so that the bidirectional predictive coding mode is less likely to be selected.
[0053]
DETAILED DESCRIPTION OF THE INVENTION
Embodiments of the present invention are described below. Before that, to clarify the correspondence between each means of the invention set forth in the claims and the embodiments below, the features of the present invention are described as follows, with the corresponding embodiment (merely one example) added in parentheses after each means.
[0054]
That is, the image encoding device according to claim 1 comprises: motion vector detection means (for example, the motion vector estimation circuit 6 shown in FIG. 6) for detecting a forward motion vector, which is the motion vector of the image to be encoded with respect to a temporally preceding past reference image, and a backward motion vector, which is the motion vector of the image to be encoded with respect to a temporally following future reference image; prediction mode determination means (for example, the prediction mode determination circuit 21 shown in FIG. 7) for determining the prediction mode of the image to be encoded in accordance with the forward motion vector or the backward motion vector; motion compensation means (for example, the motion compensation circuit 20 shown in FIG. 7) for generating a predicted image by performing motion compensation corresponding to the prediction mode; difference value calculation means (for example, as shown in FIG. 7) for calculating the difference value between the image to be encoded and the predicted image; and encoding means (for example, the DCT circuit 12, the quantization circuit 13, and the variable length encoding circuit 15 shown in FIG. 7) for encoding the difference value.
[0055]
Of course, this description does not mean that the respective means are limited to those described above.
[0056]
Next, the principle of the present invention will be described.
[0057]
In a moving image, in general, the correlation between images in the time axis direction decreases as the distance (interval) between the images increases.
[0058]
Therefore, for example, in a sequence in which one B picture is arranged between I/P pictures, as shown in FIG. 1 (which is the same as FIG. 14), the correlation between the B picture and the past reference image is equal to the correlation between the B picture and the future reference image. As a result, the statistical properties of the motion vector estimation residuals Ef and Eb with respect to the past reference image and the future reference image are also equal.
[0059]
On the other hand, in a sequence in which two or more B pictures are arranged between I/P pictures, for example as shown in FIG. 2 (which is the same as FIG. 17), the correlation between a B picture and each of the past reference image and the future reference image changes according to the distance between them.
[0060]
For this reason, for example, as shown in FIG. 3, when three B pictures B_{n+1}, B_{n+2}, B_{n+3} are placed between P pictures P_n and P_{n+4}, and each of these three B pictures is predictively encoded using P_n or P_{n+4} as the past reference image or the future reference image, the motion vector residuals E_f1, E_f2, E_f3 of the B pictures B_{n+1}, B_{n+2}, B_{n+3} with respect to the past reference image P_n generally satisfy the relationship E_f1 < E_f2 < E_f3.
[0061]
Similarly, the motion vector residuals E_b1, E_b2, E_b3 of the B pictures B_{n+1}, B_{n+2}, B_{n+3} with respect to the future reference image P_{n+4} generally satisfy the relationship E_b1 > E_b2 > E_b3.
[0062]
As described above, when two or more B pictures are arranged between I/P pictures, the distance to each of the past reference image and the future reference image differs for each B picture, and so the correlation differs as well. As a result, the statistical properties of the motion vector residuals with respect to the past reference image and the future reference image also differ for each B picture. Therefore, to improve coding efficiency, the method of determining the prediction mode when encoding each B picture needs to be changed according to those statistical properties.
[0063]
Next, the prediction accuracy of bidirectional predictive coding generally degrades as the motion of the image becomes faster. In addition, bidirectional predictive coding requires both the forward motion vector and the backward motion vector to be transmitted. Consequently, when the motion of the image is fast, even if the prediction residual of bidirectional predictive coding is the smallest, the total amount of generated data is often smaller when predictive coding uses only the reference image temporally closest to the B picture to be encoded.
[0064]
On the other hand, the speed of motion of an image can be expressed, for example, by the magnitude of the motion vector, |MV| = (v_x² + v_y²)^(1/2), where MV denotes the motion vector and v_x and v_y denote its x component (horizontal component) and y component (vertical component), respectively.
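For illustration, the magnitude can be computed as follows. The function names are hypothetical. The second function shows the sum of the absolute values of the components, which the description mentions elsewhere as an alternative measure of motion speed.

```python
import math


def motion_vector_magnitude(vx, vy):
    """Euclidean magnitude |MV| = sqrt(vx^2 + vy^2) of a motion vector."""
    return math.hypot(vx, vy)


def motion_vector_abs_sum(vx, vy):
    """Alternative speed measure: |vx| + |vy| (cheaper, no square root)."""
    return abs(vx) + abs(vy)


print(motion_vector_magnitude(3, 4))  # 5.0
print(motion_vector_abs_sum(3, -4))   # 7
```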
[0065]
Therefore, when two B pictures are arranged between I/P pictures, for example as shown in FIG. 2, coding efficiency can be improved by setting the prediction mode as follows in accordance with the magnitude |MV| of the motion vector.
[0066]
That is, let Df and Db denote the number of frames from the B picture to be encoded to the past reference image and to the future reference image, respectively. When Df = 1 and Db = 2 (when the B picture to be encoded is closer to the past reference image), then, for example, as shown in FIG. 4A, forward predictive coding is selected when the expressions Eb > p × Ef and Eb > q × Ef + (1 − p × q) × T_i hold, and backward predictive coding is selected when the expressions Eb ≦ p × Ef and Eb < r × Ef + (1 − p × r) × T_i hold. Bidirectional predictive coding is selected when the expressions r × Ef + (1 − p × r) × T_i ≦ Eb and Eb ≦ q × Ef + (1 − p × q) × T_i hold.
[0067]
Here, T_i is a constant of 0 or more, 0 < r < q, and q is smaller than the coefficient j of the rule Eb > j × Ef described later. In FIG. 4A, q = 5/4 and r = 3/4, and p = 1.
[0068]
In this case, if the prediction residual Ef is less than (or not more than) T_i, or the prediction residual Eb is less than p × T_i, bidirectional predictive coding is not selected. That is, in this case, bidirectional predictive coding can be selected only when the prediction residual Ef is T_i or more and the prediction residual Eb is p × T_i or more.
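The three inequalities for the Df = 1, Db = 2 case can be sketched as a small function. This is only an illustration: the function name `select_inter_mode` and the sample residuals are hypothetical, while p, q, and r take the FIG. 4A values given in the text.

```python
def select_inter_mode(ef, eb, p, q, r, t_i):
    """Classify (Ef, Eb) as forward, backward, or bidirectional prediction."""
    upper = q * ef + (1 - p * q) * t_i  # above this line, forward prediction
    lower = r * ef + (1 - p * r) * t_i  # below this line, backward prediction
    if eb > p * ef and eb > upper:
        return "forward"
    if eb <= p * ef and eb < lower:
        return "backward"
    if lower <= eb <= upper:
        return "bidirectional"
    # Cannot happen with p = 1 and r < 1 < q; other parameter sets may leave gaps.
    raise ValueError("(Ef, Eb) is not covered by the three conditions")


P, Q, R = 1.0, 5 / 4, 3 / 4  # values stated for FIG. 4A

print(select_inter_mode(100, 100, P, Q, R, t_i=50))  # bidirectional
print(select_inter_mode(40, 45, P, Q, R, t_i=50))    # forward
print(select_inter_mode(40, 40, P, Q, R, t_i=50))    # backward
```

With T_i = 50, the pair Ef = Eb = 40 no longer falls in the bidirectional band because Ef is below T_i, which is the behavior described in the preceding paragraph.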
[0069]
Therefore, in this case, setting the constant T_i to a larger value as the magnitude |MV| of the motion vector increases makes bidirectional predictive coding harder to select.
[0070]
That is, with T_1 < T_2 < ... < T_n < T_{n+1} and 0 < mv_0 < mv_1 < ... < mv_{n-1} < mv_n, the constant T_i is set to T_1 when the magnitude |MV| of the motion vector is mv_0 or more and less than mv_1, to T_2 when it is mv_1 or more and less than mv_2, ..., to T_n when it is mv_{n-1} or more and less than mv_n, and to T_{n+1} when it is mv_n or more. In this way, the faster the motion in the image, the harder it becomes to select bidirectional predictive coding, whose prediction accuracy falls while the number of bits allocated to motion vectors rises considerably; as a result, coding efficiency can be improved.
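The stepwise assignment of T_i can be sketched as a table lookup. The boundary and threshold values below are invented purely for illustration, and the helper names are hypothetical; the text only requires that both sequences be increasing.

```python
import bisect
import math


def motion_magnitude(vx, vy):
    """|MV| = (v_x^2 + v_y^2)^(1/2)."""
    return math.hypot(vx, vy)


def select_threshold(magnitude, boundaries, thresholds):
    """Pick T_i for a given |MV|.

    boundaries = [mv_0, mv_1, ..., mv_n]      (increasing)
    thresholds = [T_1, T_2, ..., T_{n+1}]     (increasing)
    Returns None below mv_0, a case the text handles separately.
    """
    if len(boundaries) != len(thresholds):
        raise ValueError("need one threshold per boundary")
    k = bisect.bisect_right(boundaries, magnitude)  # number of boundaries <= |MV|
    return None if k == 0 else thresholds[k - 1]


# Hypothetical tables, for illustration only.
BOUNDARIES = [2, 4, 8]
THRESHOLDS = [100, 200, 400]

print(select_threshold(motion_magnitude(3, 4), BOUNDARIES, THRESHOLDS))  # 200
```

`bisect_right` places a magnitude equal to a boundary in the upper interval, which matches the "or more and less than" wording of the text.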
[0071]
Further, in this case, since the B picture being encoded is close to the past reference image, forward predictive coding, which uses only the past reference image, is made easy to select, and this also improves coding efficiency.
[0072]
On the other hand, when Df = 2 and Db = 1 (the B picture being encoded is closer to the future reference image), then, for example as shown in FIG. 4B, forward predictive coding is selected when Eb > s × Ef and Eb > t × Ef + (1 − s × t) × T_i hold, and backward predictive coding is selected when Eb ≤ s × Ef and Eb < u × Ef + (1 − s × u) × T_i hold. Bidirectional predictive coding is selected when u × Ef + (1 − s × u) × T_i ≤ Eb ≤ t × Ef + (1 − s × t) × T_i holds.
[0073]
Here, 0 < u < t, and u is larger than the coefficient k of the rule Eb < k × Ef described later. In FIG. 4B, t = 4/3 and u = 4/5, and s = 1.
[0074]
Also in this case, if the prediction residual Ef is less than T_i, or the prediction residual Eb is less than s × T_i, bidirectional predictive coding is not selected. That is, in this case, bidirectional predictive coding can be selected only when the prediction residual Ef is T_i or more and the prediction residual Eb is s × T_i or more.
[0075]
Therefore, as in the case described above, setting the constant T_i to a larger value as the magnitude |MV| of the motion vector increases makes bidirectional predictive coding harder to select, and as a result coding efficiency can be improved.
[0076]
Further, in this case, since the B picture being encoded is close to the future reference image, backward predictive coding, which uses only the future reference image, is made easy to select, and this also improves coding efficiency.
[0077]
Note that, when the motion in the image is slow, the prediction accuracy of bidirectional predictive coding is high and the amount of generated code is small, as described above, so it is desirable to select bidirectional predictive coding. Therefore, when the magnitude |MV| of the motion vector is less than a predetermined value mv_0, then, for example as shown in FIG. 5, which is the same as FIG. 16, forward predictive coding is selected when Eb > j × Ef holds, backward predictive coding is selected when Eb < k × Ef holds, and bidirectional predictive coding is selected when k × Ef ≤ Eb ≤ j × Ef holds.
[0078]
That is, this corresponds to setting, for example, t = q = j, r = u = k, and T_i = 0 in FIG. 4.
[0079]
In this way, when the magnitude |MV| of the motion vector is less than mv_0, bidirectional predictive coding, whose prediction accuracy is then high, becomes easy to select, and as a result coding efficiency can be improved.
[0080]
Besides the magnitude |MV| of the motion vector, the speed of motion in the image is also reflected in, for example, the sum |x| + |y| of the absolute values of the x and y components of the motion vector. The constant T_i can therefore also be set according to this sum of absolute values |x| + |y|.
[0081]
Next, the prediction accuracy of bidirectional predictive coding varies with the complexity of the motion in the image as well as with its speed. Basically, the accuracy is high when the motion is uniform and simple, such as an object panning in the horizontal direction, and decreases as the motion becomes more complicated.
[0082]
For this reason, given that bidirectional predictive coding requires both the forward motion vector and the backward motion vector to be transmitted, when the motion in the image is complicated, even if bidirectional predictive coding yields the smallest prediction residual, predictive coding using only the reference image temporally closest to the B picture being encoded (either one, if the past and future reference images are equally distant) often generates less data in total.
[0083]
Now, in an image in which an object moves in translation, for example, the forward motion vector and the backward motion vector point in opposite directions. That is, the signs of the x and y components of the forward motion vector differ from the signs of the corresponding components of the backward motion vector (x sign compared with x sign, y sign compared with y sign).
[0084]
When the object moves in a complicated manner, on the other hand, the signs of the x components, the signs of the y components, or both are the same.
[0085]
Therefore, denoting the x and y components of the forward motion vector by Fx and Fy, and the x and y components of the backward motion vector by Bx and By, the quantity SMV given by the following equation reflects the complexity of the motion in the image.
[0086]
SMV = | Fx + Bx | + | Fy + By |
[0087]
Note that SMV changes with the complexity of the motion in the image: it tends to be small when the prediction accuracy of both forward predictive coding and backward predictive coding is high, and to be large when that accuracy is low.
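As a quick illustration of the equation above, SMV can be computed directly from the four vector components. The function name `smv` and the sample vectors are hypothetical.

```python
def smv(fx, fy, bx, by):
    """SMV = |Fx + Bx| + |Fy + By| for forward (Fx, Fy) and backward (Bx, By) vectors."""
    return abs(fx + bx) + abs(fy + by)


# Opposite vectors (simple translation) cancel out.
print(smv(3, -1, -3, 1))  # 0
# Components with matching signs (complicated motion) add up.
print(smv(3, -1, 2, 4))   # 8
```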
[0088]
Therefore, when two B pictures are arranged between I/P pictures, for example as shown in FIG. 2, coding efficiency can be improved by setting the prediction mode according to SMV as follows.
[0089]
That is, first, when Df = 1 and Db = 2, then, for example as shown in FIG. 4A, forward predictive coding is selected when Eb > p × Ef and Eb > q × Ef + (1 − p × q) × T_i hold, and backward predictive coding is selected when Eb ≤ p × Ef and Eb < r × Ef + (1 − p × r) × T_i hold. Bidirectional predictive coding is selected when r × Ef + (1 − p × r) × T_i ≤ Eb ≤ q × Ef + (1 − p × q) × T_i holds.
[0090]
In this case, as described above, if the prediction residual Ef is less than T_i, or the prediction residual Eb is less than p × T_i, bidirectional predictive coding is not selected. That is, in this case, bidirectional predictive coding can be selected only when the prediction residual Ef is T_i or more and the prediction residual Eb is p × T_i or more.
[0091]
Therefore, in this case, setting the constant T_i to a larger value as SMV increases makes bidirectional predictive coding harder to select.
[0092]
That is, with 0 < MV_0 < MV_1 < ... < MV_{n-1} < MV_n, the constant T_i is set to T_1 when SMV is MV_0 or more and less than MV_1, to T_2 when SMV is MV_1 or more and less than MV_2, ..., to T_n when SMV is MV_{n-1} or more and less than MV_n, and to T_{n+1} when SMV is MV_n or more. In this way, the more complicated the motion in the image, the harder it becomes to select bidirectional predictive coding, whose prediction accuracy falls accordingly; as a result, coding efficiency can be improved.
[0093]
Further, in this case, since the B picture being encoded is close to the past reference image, forward predictive coding, which uses only the past reference image, is made easy to select, and this also improves coding efficiency.
[0094]
On the other hand, when Df = 2 and Db = 1, then, for example as shown in FIG. 4B, forward predictive coding is selected when Eb > s × Ef and Eb > t × Ef + (1 − s × t) × T_i hold, and backward predictive coding is selected when Eb ≤ s × Ef and Eb < u × Ef + (1 − s × u) × T_i hold. Bidirectional predictive coding is selected when u × Ef + (1 − s × u) × T_i ≤ Eb ≤ t × Ef + (1 − s × t) × T_i holds.
[0095]
Also in this case, if the prediction residual Ef is less than T_i, or the prediction residual Eb is less than s × T_i, bidirectional predictive coding is not selected. That is, in this case, bidirectional predictive coding can be selected only when the prediction residual Ef is T_i or more and the prediction residual Eb is s × T_i or more.
[0096]
Therefore, as in the case described above, setting the constant T_i to a larger value as SMV increases makes bidirectional predictive coding harder to select, and as a result coding efficiency can be improved.
[0097]
Further, in this case, since the B picture being encoded is close to the future reference image, backward predictive coding, which uses only the future reference image, is made easy to select, and this also improves coding efficiency.
[0098]
In addition, when the motion in the image is very simple, for example when an object is translated in a fixed direction, SMV takes a very small value (ideally, 0). In this case, as described above, the prediction accuracy of bidirectional predictive coding is high and the amount of generated code is small, so it is desirable to select bidirectional predictive coding. Therefore, when SMV is less than a predetermined value MV_0, then, for example as shown in FIG. 5, which is the same as FIG. 16, forward predictive coding is selected when Eb > j × Ef holds, backward predictive coding is selected when Eb < k × Ef holds, and bidirectional predictive coding is selected when k × Ef ≤ Eb ≤ j × Ef holds.
[0099]
That is, this corresponds to setting, for example, t = q = j, r = u = k, and T_i = 0 in FIG. 4.
[0100]
In this way, when SMV is less than MV_0, bidirectional predictive coding, whose prediction accuracy is then high, becomes easy to select, and as a result coding efficiency can be improved.
[0101]
Another example of very simple motion is an image shot while panning a video camera. In this case, the x component of the motion vector is much larger than the y component. Thus, for example, when |x| > g|y| holds, where g is a predetermined constant (a value larger than 1, for example, 4), bidirectional predictive coding can be made easier to select in the same way as described above. The same applies when g|x| < |y| holds.
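The dominance test on the vector components might look like the sketch below. It assumes the first condition reads |x| > g|y|, by symmetry with the stated condition g|x| < |y|; the function name is hypothetical and g = 4 is the example value from the text.

```python
def is_single_axis_motion(x, y, g=4):
    """True when one component of the motion vector dominates the other by a factor g."""
    ax, ay = abs(x), abs(y)
    return ax > g * ay or g * ax < ay


print(is_single_axis_motion(20, 2))  # True: horizontal pan
print(is_single_axis_motion(5, 3))   # False: no dominant axis
print(is_single_axis_motion(1, 10))  # True: vertical motion
```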
[0102]
As described above, by selecting (deciding) the prediction mode adaptively according to the speed and complexity of the motion of the image, the encoding efficiency can be improved as compared with the conventional case.
[0103]
In the above description, two B pictures are arranged between I/P pictures, but the same applies when only one B picture, or three or more B pictures, are arranged between them.
[0104]
Next, FIG. 6 and FIG. 7 show a configuration of an embodiment of an image encoding device to which the present invention is applied.
[0105]
This image encoding apparatus determines the prediction mode according to, for example, the SMV described above, which reflects the complexity of the motion in the image, and encodes images by hybrid coding that combines motion compensation with the DCT (Discrete Cosine Transform).
[0106]
That is, the image data to be encoded is supplied to the image encoding type designation circuit 3 in units of frames (or fields), for example. The image coding type designation circuit 3 designates whether the frame input thereto is to be processed as an I, P, or B picture (hereinafter referred to as a picture type as appropriate).
[0107]
Specifically, for example, the image coding type designation circuit 3 processes the 16 frames F1 to F16 input to it, shown in FIG. 8A, as the data of one GOP and, as shown in FIG. 8B, designates the first frame F1 as an I picture, the second and third frames F2 and F3 as B pictures, and the fourth frame F4 as a P picture. It further designates the fifth and sixth frames F5 and F6 as B pictures and the seventh frame F7 as a P picture, and thereafter designates the remaining frames F8 to F16 as B or P pictures in the same manner.
[0108]
In FIG. 8B (the same applies to FIG. 8C), the subscript numbers attached to I, P, and B correspond to the temporal reference in MPEG and indicate the display order of each frame.
[0109]
The frames whose picture types have been designated by the image coding type designation circuit 3 are output to the image encoding order changing circuit 4, which rearranges them into encoding order. A B picture may be predicted using, as a reference image (future reference image), an image that is displayed after it; on the receiving side, such a B picture cannot be decoded unless the future reference image has already been decoded. The image encoding order changing circuit 4 therefore rearranges the frames constituting the GOP so that the frame serving as the future reference image is encoded before the B picture.
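The reordering idea can be sketched as follows: B pictures are held back until the next I or P picture (their future reference) has been emitted. The function name and the tuple representation of pictures are assumptions made for this illustration, not the circuit's actual implementation.

```python
def to_coding_order(display_order):
    """Reorder (type, display_index) pairs so each B follows its future reference."""
    coded, pending_b = [], []
    for picture in display_order:
        if picture[0] == "B":
            pending_b.append(picture)      # wait for the future reference
        else:
            coded.append(picture)          # I or P picture goes out first
            coded.extend(pending_b)        # then the B pictures that precede it
            pending_b.clear()
    coded.extend(pending_b)                # trailing B pictures, if any
    return coded


display = [("I", 1), ("B", 2), ("B", 3), ("P", 4), ("B", 5), ("B", 6), ("P", 7)]
print(to_coding_order(display))
# [('I', 1), ('P', 4), ('B', 2), ('B', 3), ('P', 7), ('B', 5), ('B', 6)]
```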
[0110]
Specifically, for example, rearrangement is performed as shown in FIG.
[0111]
The sequence of frames rearranged by the image encoding order changing circuit 4 is supplied to the scan converter 5. The scan converter 5 converts a frame input by raster scanning into a block format signal.
[0112]
That is, for example, image data in frame format, consisting of V lines of H dots each, is input to the scan converter 5. The scan converter 5 divides this image data into N slices of 16 lines each, as shown in FIG. 9A (hence V = 16 × N), and further divides each slice into M macroblocks of 16 dots each in the horizontal direction, as shown in FIG. 9B (hence H = 16 × M).
[0113]
Each macroblock is therefore composed of luminance signals corresponding to 16 × 16 dots. As shown in FIG. 9C, these are divided into luminance blocks Y[1] to Y[4] of 8 × 8 dots each, and the color difference signals Cb[5] and Cr[6], each corresponding to 8 × 8 dots, are associated with the macroblock. The DCT circuit 12 (FIG. 7), described later, performs DCT processing in units of these 8 × 8 dot blocks.
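A minimal sketch of the slice and macroblock division, treating the luminance frame as a list of rows. The function name is hypothetical, and color difference handling is omitted.

```python
def split_into_macroblocks(frame, size=16):
    """Split a V x H frame into N slices of M macroblocks (V = 16N, H = 16M)."""
    v, h = len(frame), len(frame[0])
    if v % size or h % size:
        raise ValueError("frame dimensions must be multiples of the macroblock size")
    slices = []
    for top in range(0, v, size):              # one slice per 16 lines
        macroblocks = []
        for left in range(0, h, size):         # one macroblock per 16 dots
            block = [line[left:left + size] for line in frame[top:top + size]]
            macroblocks.append(block)
        slices.append(macroblocks)
    return slices


# 32 lines x 48 dots -> N = 2 slices of M = 3 macroblocks each.
frame = [[y * 48 + x for x in range(48)] for y in range(32)]
slices = split_into_macroblocks(frame)
print(len(slices), len(slices[0]))  # 2 3
print(slices[1][2][0][0])           # 800, the pixel at line 16, dot 32
```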
[0114]
As described above, the macroblock obtained by the scan converter 5 is supplied to the arithmetic unit 11 in FIG.
[0115]
Returning to FIG. 6, the counter 9 counts the frame synchronization signal output from the image coding order changing circuit 4.
[0116]
That is, the image encoding order changing circuit 4 outputs a frame synchronization signal to the counter 9 at the timing at which each rearranged frame is output to the scan converter 5. Further, the image encoding order changing circuit 4 detects the picture type TYPE of the frame to be output to the scan converter 5 and outputs it to the motion vector estimation circuit 6, the counter 9, and the prediction mode determination circuit 21 of FIG. 7.
[0117]
The counter 9 counts the frame synchronization signal output from the image encoding order changing circuit 4 and outputs the count value CNT to the inter-image distance generation circuit 10. The counter 9 is configured to reset the count value CNT to 0, for example, when the picture type TYPE output from the image coding order changing circuit 4 is an I or P picture.
[0118]
Accordingly, the count value CNT output from the counter 9 represents the number of B pictures arranged between the I or P pictures.
[0119]
Here, in the present embodiment, two B pictures are arranged between I or P pictures, as shown in FIG. 8B, so the count value CNT output from the counter 9 takes the value 0, 1, or 2, as shown in FIG. 8D.
[0120]
Based on the count value CNT from the counter 9, the inter-image distance generation circuit 10 calculates the distances (numbers of frames) Df and Db from the B picture to the past reference image and to the future reference image used for its predictive coding (inter coding), and outputs them to the prediction mode determination circuit 21 of FIG. 7.
[0121]
That is, the inter-image distance generation circuit 10 outputs, as the distance Df to the past reference image, the same value as the count value CNT, as shown in FIG. 8E, and outputs, as the distance Db to the future reference image, the count values CNT arranged in reverse order, as shown in FIG. 8F.
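The counter and the distance generation can be sketched together. The sketch assumes that "arranged in reverse order" means Db = (number of B pictures between references + 1) − CNT, which reproduces the Df = 1, Db = 2 and Df = 2, Db = 1 cases used earlier; the function name is hypothetical, and distances are reported only for B pictures.

```python
def frame_distances(coding_order_types, b_between=2):
    """Yield (type, CNT, Df, Db) for each frame in coding order."""
    rows = []
    cnt = 0
    for picture_type in coding_order_types:
        if picture_type in ("I", "P"):
            cnt = 0                                  # counter reset on I or P
            rows.append((picture_type, cnt, None, None))
        else:
            cnt += 1                                 # one frame sync per B picture
            df = cnt                                 # distance to past reference
            db = b_between + 1 - cnt                 # assumed reverse of CNT
            rows.append((picture_type, cnt, df, db))
    return rows


for row in frame_distances("IPBBPBB"):
    print(row)
```

For the coding order I, P, B, B, P, B, B this gives CNT = 0, 0, 1, 2, 0, 1, 2, with (Df, Db) = (1, 2) for the first B picture of each pair and (2, 1) for the second.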
[0122]
Meanwhile, the motion vector estimation circuit 6 detects (estimates) the forward motion vector MVf and the backward motion vector MVb, and calculates the prediction residuals (motion vector estimation residuals) Ef and Eb for the forward motion vector MVf and the backward motion vector MVb, respectively.
[0123]
That is, the motion vector estimation circuit 6 is supplied with a frame in which the picture type TYPE is designated and the picture type TYPE from the image coding order changing circuit 4.
[0124]
The motion vector estimation circuit 6 stores the frame supplied from the image encoding order changing circuit 4, according to its picture type TYPE, in one of the past reference image storage unit 7A, the current image storage unit 7B, and the future reference image storage unit 7C constituting the storage unit 7, and detects motion vectors for the image stored in the current image storage unit 7B.
[0125]
Specifically, for the rearranged frame sequence, for example, the motion vector estimation circuit 6 first stores I_1 in the past reference image storage unit 7A and P_4 in the current image storage unit 7B, detects the motion vector (forward motion vector) MVf of P_4 using I_1 as the past reference image, and obtains its prediction residual Ef. Next, P_4 stored in the current image storage unit 7B is transferred to the future reference image storage unit 7C, and B_2 is stored in the current image storage unit 7B. Using I_1 and P_4 as the past and future reference images, respectively, the forward motion vector MVf and the backward motion vector MVb of B_2 are detected, and the respective prediction residuals Ef and Eb are obtained.
[0126]
Next, B_3 is stored in the current image storage unit 7B and, as in the case described above, the forward motion vector MVf and the backward motion vector MVb of B_3 are detected, and the respective prediction residuals Ef and Eb are obtained.
[0127]
Thereafter, P_4 stored in the future reference image storage unit 7C is transferred to the past reference image storage unit 7A and stored there (overwriting its contents), and P_7 is stored in the current image storage unit 7B. The motion vector MVf of P_7 is then detected using P_4 as the past reference image, and its prediction residual Ef is obtained.
[0128]
Next, P_7 stored in the current image storage unit 7B is transferred to the future reference image storage unit 7C, and B_5 is stored in the current image storage unit 7B. Using P_4 and P_7 as the past and future reference images, respectively, the forward motion vector MVf and the backward motion vector MVb of B_5 are detected, and the respective prediction residuals Ef and Eb are obtained. Motion vectors are detected and prediction residuals are calculated in the same manner thereafter.
[0129]
Here, a method of calculating the prediction errors Ef and Eb will be described.
[0130]
Now, take a certain macroblock as the target macroblock, and let A_ij denote the pixel value of the pixel that is i-th from the left and j-th from the top in the target macroblock, and F_ij the pixel value of the pixel that is i-th from the left and j-th from the top in the 16 × 16 region of the past reference image that most closely approximates the target macroblock. The prediction error Ef is then calculated, for example, according to the following equation.
[0131]
Ef = Σ |A_ij − F_ij|
In the above equation, Σ represents a summation with i and j changed from 1 to 16.
[0132]
Likewise, let B_ij denote the pixel value of the pixel that is i-th from the left and j-th from the top in the 16 × 16 region of the future reference image that most closely approximates the target macroblock. The prediction error Eb is then calculated, for example, according to the following equation.
[0133]
Eb = Σ |A_ij − B_ij|
In the above formula, Σ represents the summation with i and j changed to 1 to 16.
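Both Ef and Eb are sums of absolute differences over the 16 × 16 block, so a single helper covers them. The function name and the flat test blocks below are hypothetical; blocks are assumed to be lists of rows of equal size.

```python
def sum_abs_diff(target, reference):
    """Σ |A_ij − R_ij| over all pixels of two equally sized blocks."""
    return sum(
        abs(a - r)
        for row_a, row_r in zip(target, reference)
        for a, r in zip(row_a, row_r)
    )


target = [[10] * 16 for _ in range(16)]
past = [[7] * 16 for _ in range(16)]     # stands in for the best-matching F block
future = [[12] * 16 for _ in range(16)]  # stands in for the best-matching B block

ef = sum_abs_diff(target, past)
eb = sum_abs_diff(target, future)
print(ef, eb)  # 768 512
```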
[0134]
The motion vectors MVf and MVb and the prediction errors Ef and Eb obtained as described above are supplied to the prediction mode determination circuit 21 of FIG. 7. The motion vectors MVf and MVb are also supplied to the variable length coding circuit 15 and the motion compensation circuit 20, and the motion vectors MVf and MVb for B pictures are additionally supplied to the motion amount calculation circuit 8.
[0135]
The motion amount calculation circuit 8 calculates the SMV described above from the motion vectors MVf and MVb and supplies it to the prediction mode determination circuit 21 of FIG. 7.
[0136]
In the prediction mode determination circuit 21 of FIG. 7, the prediction mode of the macroblock is determined based on the distances Df and Db, the motion vectors MVf and MVb, the picture type TYPE, and the SMV.
[0137]
That is, when the picture type TYPE is an I picture, that is, when the encoding target macroblock is an I picture, the prediction mode determination circuit 21 determines the prediction mode as an intra encoding mode.
[0138]
When the picture type TYPE is a P picture, that is, when the macroblock to be encoded belongs to a P picture, the prediction mode determination circuit 21 determines the prediction mode to be either the intra coding mode or the forward predictive coding mode, as follows.
[0139]
In other words, in this case, the prediction mode determination circuit 21 first calculates, for example, E_intra defined by the following equation as the prediction residual for intra coding.
[0140]
E_intra = Σ |A_ij − A_av|
In the above equation, A_ij represents the pixel value of the pixel that is i-th from the left and j-th from the top in the macroblock to be encoded, and A_av represents the average of these pixel values. Σ represents summation over i and j from 1 to 16.
[0141]
The prediction mode determination circuit 21 then determines the prediction mode to be the intra coding mode when the prediction residual E_intra for intra coding is smaller than (or not greater than) the prediction residual Ef for forward predictive coding, and to be the forward predictive coding mode when E_intra is equal to or greater than (or greater than) Ef.
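The P-picture decision can be sketched in a few lines. The sketch uses the strict comparison E_intra < Ef for the intra case; the function names and the sample blocks are hypothetical.

```python
def intra_residual(block):
    """E_intra = Σ |A_ij − A_av|, with A_av the mean pixel value of the block."""
    pixels = [value for row in block for value in row]
    mean = sum(pixels) / len(pixels)
    return sum(abs(value - mean) for value in pixels)


def select_p_mode(block, ef):
    """Intra coding when E_intra is smaller than Ef, forward prediction otherwise."""
    return "intra" if intra_residual(block) < ef else "forward"


flat = [[128] * 16 for _ in range(16)]                  # E_intra = 0
edge = [[0] * 16] * 8 + [[200] * 16] * 8                # E_intra = 25600

print(select_p_mode(flat, ef=500))   # intra
print(select_p_mode(edge, ef=1000))  # forward
```

A flat block has almost no intra residual, so intra coding wins even against a modest forward residual, while a block with strong internal contrast favors forward prediction.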
[0142]
Next, when the picture type TYPE is a B picture, that is, when the macroblock to be encoded belongs to a B picture, the prediction mode determination circuit 21 determines the prediction mode to be one of the intra coding mode, the forward predictive coding mode, the backward predictive coding mode, and the bidirectional predictive coding mode, as follows.
[0143]
That is, first, the prediction mode determination circuit 21 selects (determines) one of inter coding, that is, one of the forward prediction coding mode, the backward prediction coding mode, and the bidirectional prediction coding mode.
[0144]
This selection is performed based on SMV, prediction residuals Ef and Eb, distances Df and Db, and motion vectors MVf and MVb.
[0145]
That is, first, the constant T_i described with reference to FIG. 4 is set according to SMV. Then, either FIG. 4A or FIG. 4B is selected according to the distances Df and Db, and, in the selected one, one of the forward predictive coding mode, the backward predictive coding mode, and the bidirectional predictive coding mode is selected based on the magnitude relationship between the prediction residuals Ef and Eb, as described above.
[0146]
Note that, when SMV is equal to or less than the predetermined value MV_0, or when one of the absolute values of the x and y components of the motion vectors MVf and MVb is sufficiently larger than the other, one of the forward predictive coding mode, the backward predictive coding mode, and the bidirectional predictive coding mode is selected, as described above, based on the magnitude relationship between the prediction residuals Ef and Eb described with reference to FIG. 5.
[0147]
The prediction residual corresponding to the prediction mode selected from among the inter coding modes is then taken as the prediction residual E_inter for inter coding. When the bidirectional predictive coding mode is selected, E_inter is, for example, the average of the prediction residuals Ef and Eb. Thus, when the forward predictive coding mode, the backward predictive coding mode, or the bidirectional predictive coding mode is selected, E_inter is Ef, Eb, or (Ef + Eb)/2, respectively.
[0148]
Further, the prediction mode determination circuit 21 calculates the prediction residual E_intra for intra coding in the same manner as described above. It then determines the prediction mode to be the intra coding mode when E_intra is smaller than the prediction residual E_inter of the mode selected from among the inter coding modes, and to be that selected inter coding mode when E_intra is equal to or greater than E_inter.
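The final intra/inter decision for a B-picture macroblock can be sketched as below, taking the already selected inter mode as input. The function names and numeric residuals are hypothetical, and the sketch assumes intra coding is chosen only when E_intra is strictly smaller than E_inter.

```python
def inter_residual(inter_mode, ef, eb):
    """E_inter is Ef, Eb, or (Ef + Eb) / 2 depending on the selected inter mode."""
    residuals = {
        "forward": ef,
        "backward": eb,
        "bidirectional": (ef + eb) / 2,
    }
    return residuals[inter_mode]


def select_b_mode(inter_mode, ef, eb, e_intra):
    """Choose intra coding only if it beats the selected inter mode's residual."""
    if e_intra < inter_residual(inter_mode, ef, eb):
        return "intra"
    return inter_mode


print(select_b_mode("bidirectional", ef=100, eb=60, e_intra=70))  # intra (70 < 80)
print(select_b_mode("forward", ef=100, eb=60, e_intra=120))       # forward
```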
[0149]
Thus, for B pictures, the prediction mode is determined adaptively according to the complexity of the motion in the image and the distance to the reference images, so that coding efficiency can be further improved.
[0150]
The prediction mode determined as described above is supplied from the prediction mode determination circuit 21 to the calculation unit 11, the variable length coding circuit 15, and the motion compensation circuit 20.
[0151]
The macroblock to be encoded, which is to be predictively coded in the prediction mode supplied from the prediction mode determination circuit 21, is supplied from the scan converter 5 to the computing unit 11. The computing unit 11 includes computing units 11A to 11C and a switch SW, and the switch SW is switched according to the prediction mode.
[0152]
That is, when an I-picture macroblock is input to the calculation unit 11, the prediction mode is an intra coding mode. In this case, the switch SW selects the terminal a. The terminal a is supplied with the macroblock to be encoded as it is. Therefore, this macroblock is supplied to the DCT circuit 12 through the terminal a.
[0153]
In the DCT circuit 12, the macroblock from the calculation unit 11 is subjected to DCT processing, and thereby converted into DCT coefficients. This DCT coefficient is supplied to the quantization circuit 13, where it is quantized at a predetermined quantization step and then supplied to the variable length coding circuit 15.
[0154]
The variable length coding circuit 15 is supplied with the quantized DCT coefficients and the quantization step from the quantization circuit 13, with the prediction mode from the prediction mode determination circuit 21, and with the motion vectors MVf and MVb from the motion vector estimation circuit 6. The variable length coding circuit 15 converts these data, as appropriate, into variable length codes such as Huffman codes and outputs them to the transmission buffer 14.
[0155]
The transmission buffer 14 temporarily stores the variable length codes from the variable length coding circuit 15 and outputs them at, for example, a constant data rate. The variable length codes output from the transmission buffer 14 are recorded on a recording medium 31 such as an optical disk, a magneto-optical disk, a magnetic disk, an optical card, a magnetic tape, or a phase change disk, or are transmitted via a transmission path 32 such as a satellite link, terrestrial broadcast, a CATV network, or the Internet.
[0156]
The transmission buffer 14 supplies (feeds back) the accumulated amount of data to the quantization circuit 13. The quantization circuit 13 is configured to set a quantization step based on the accumulated amount. That is, the quantization circuit 13 increases the quantization step when the transmission buffer 14 is likely to overflow, thereby reducing the data generation amount. Further, when the transmission buffer 14 is likely to underflow, the quantization circuit 13 reduces the quantization step, thereby increasing the data generation amount. As described above, overflow and underflow of the transmission buffer 14 are prevented.
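The feedback from the transmission buffer to the quantizer can be illustrated with a simple rule. The text does not specify how the step is derived from the accumulated amount, so the fullness limits, the unit step change, and the step range below are all assumptions chosen only to show the direction of the adjustment.

```python
def next_quant_step(step, stored, capacity,
                    high=0.8, low=0.2, min_step=1, max_step=31):
    """Adjust the quantization step from the buffer's accumulated amount.

    high/low fullness limits and the step range are illustrative assumptions.
    """
    fullness = stored / capacity
    if fullness > high:                  # near overflow: coarser quantization
        return min(step + 1, max_step)
    if fullness < low:                   # near underflow: finer quantization
        return max(step - 1, min_step)
    return step


print(next_quant_step(8, stored=90, capacity=100))  # 9
print(next_quant_step(8, stored=10, capacity=100))  # 7
print(next_quant_step(8, stored=50, capacity=100))  # 8
```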
[0157]
Meanwhile, the quantized DCT coefficients and the quantization step output from the quantization circuit 13 are supplied to the inverse quantization circuit 16 as well as to the variable length coding circuit 15. The inverse quantization circuit 16 inversely quantizes the quantized DCT coefficients from the quantization circuit 13 using the same quantization step supplied from the quantization circuit 13, and outputs the resulting DCT coefficients to the IDCT circuit 17.
[0158]
In the IDCT circuit 17, the DCT coefficient from the inverse quantization circuit 16 is subjected to inverse DCT processing, whereby image data having substantially the same value as the output of the computing unit 11 is restored and supplied to the computing unit 18. If the image data input thereto is to be intra-encoded, the arithmetic unit 18 outputs the image data as it is to the frame memory 19 for storage without performing any particular processing.
[0159]
The frame memory 19 includes a future reference image storage circuit 19A and a past reference image storage circuit 19B, which store the images used as the future reference image and the past reference image, respectively. The I picture that is encoded and decoded first is stored in the past reference image storage circuit 19B.
[0160]
Next, when the macroblock input to the calculation unit 11 is a P picture and the prediction mode is the intra coding mode, the switch SW selects the terminal a. Therefore, in this case, the macroblock of the P picture is encoded in the same manner as in the case of the I picture described above, and is locally decoded and supplied to the frame memory 19. The P picture encoded and decoded next to the I picture is stored in the future reference image storage circuit 19A.
[0161]
On the other hand, when the macroblock input to the calculation unit 11 belongs to a P picture and the prediction mode is the forward predictive coding mode, the switch SW selects the terminal b. The terminal b is supplied with the output of the computing unit 11A, and the computing unit 11A is supplied with the macroblock to be encoded and the output of the motion compensation circuit 20.
[0162]
When the prediction mode is the forward predictive coding mode, the motion compensation circuit 20 reads the image stored in the past reference image storage circuit 19B (in this case, the I picture) as the past reference image and generates a predicted image by applying motion compensation to it according to the motion vector MVf. That is, the motion compensation circuit 20 reads, from the past reference image storage circuit 19B, the data at the address shifted by the amount corresponding to the motion vector MVf from the position corresponding to the macroblock being encoded, and supplies it to the computing unit 11A as the predicted image.
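Reading the displaced block from the reference image can be sketched as a slice operation. The function name, the (dx, dy) sign convention, and the integer-pixel displacement are assumptions of this illustration.

```python
def fetch_prediction(reference, top, left, mv, size=16):
    """Return the size x size block of `reference` displaced by mv = (dx, dy)."""
    dx, dy = mv
    y0, x0 = top + dy, left + dx
    height, width = len(reference), len(reference[0])
    if not (0 <= y0 <= height - size and 0 <= x0 <= width - size):
        raise ValueError("displaced block falls outside the reference image")
    return [row[x0:x0 + size] for row in reference[y0:y0 + size]]


# Reference image whose pixel value encodes its own position (line * 64 + dot).
reference = [[y * 64 + x for x in range(64)] for y in range(48)]
predicted = fetch_prediction(reference, top=16, left=16, mv=(3, -2))
print(predicted[0][0])  # 915, the pixel at line 14, dot 19
```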
[0163]
The computing unit 11A subtracts the corresponding pixel value of the predicted image from each pixel value of the macroblock to be encoded, and outputs the subtraction value (difference value). Therefore, in this case, the computing unit 11 supplies the DCT circuit 12 with the difference value between the macroblock to be encoded and the predicted image obtained from the past reference image. This difference value is encoded and output in the same manner as in intra coding.
[0164]
Further, as in the case described above, this difference value is restored to almost its original value through the DCT circuit 12, the quantization circuit 13, the inverse quantization circuit 16, and the IDCT circuit 17, and is supplied to the computing unit 18.
[0165]
In this case, the computing unit 18 is also supplied, from the motion compensation circuit 20, with the same data as the predicted image supplied to the computing unit 11A. The computing unit 18 adds the restored difference value and the predicted image, whereby the P picture is locally decoded. The locally decoded P picture is supplied to and stored in the frame memory 19.
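The subtraction in the computing unit 11A and the addition in the computing unit 18 can be sketched as follows. This is a simplified illustration, not the circuit itself: `lossy_round_trip` is a hypothetical stand-in for the DCT, quantization, inverse quantization, and IDCT path that models only the quantization loss, and the pixel values and step size are made up.

```python
def subtract(block, predicted):
    """Computing unit 11A: per-pixel difference between block and prediction."""
    return [[b - p for b, p in zip(brow, prow)]
            for brow, prow in zip(block, predicted)]


def add(restored_diff, predicted):
    """Computing unit 18: add the restored difference back onto the prediction."""
    return [[d + p for d, p in zip(drow, prow)]
            for drow, prow in zip(restored_diff, predicted)]


def lossy_round_trip(diff, step):
    """Hypothetical stand-in for DCT -> quantize -> dequantize -> IDCT.

    Only the quantization loss is modeled, by rounding to a multiple of `step`.
    """
    return [[step * round(d / step) for d in row] for row in diff]


block = [[53, 57], [61, 57]]          # macroblock to be encoded (toy 2 x 2)
predicted = [[50, 50], [60, 60]]      # output of the motion compensation circuit
diff = subtract(block, predicted)     # what is sent on to the DCT circuit
restored = lossy_round_trip(diff, step=4)
decoded = add(restored, predicted)    # locally decoded picture data
```

The locally decoded values are close to, but not identical with, the original block, which is why the encoder stores the decoded picture rather than the source picture as the reference.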
[0166]
Note that the P picture encoded and decoded next to the I picture is stored in the future reference image storage circuit 19A as described above.
[0167]
Next, when the macroblock input to the computing unit 11 is a B picture and the prediction mode is the intra coding mode or the forward predictive coding mode, the switch SW selects the terminal a or b, respectively. Therefore, in this case, the macroblock of the B picture is encoded in the same manner as described above.
[0168]
On the other hand, when the macroblock input to the computing unit 11 is a B picture and the prediction mode is the backward predictive coding mode, the switch SW selects the terminal c. The terminal c receives the output of the computing unit 11B, and the computing unit 11B receives the macroblock to be encoded and the output of the motion compensation circuit 20.
[0169]
When the prediction mode is the backward predictive coding mode, the motion compensation circuit 20 reads the image stored in the future reference image storage circuit 19A (in this case, a P picture) as a future reference image and generates a predicted image by performing motion compensation according to the motion vector MVb. In other words, the motion compensation circuit 20 reads, from the future reference image storage circuit 19A, the data at the address shifted from the position corresponding to the macroblock to be encoded by the amount corresponding to the motion vector MVb, and supplies it to the computing unit 11B as the predicted image.
[0170]
The computing unit 11B subtracts the corresponding pixel value of the predicted image from each pixel value of the macroblock to be encoded, and outputs the subtraction value (difference value). Therefore, in this case, the computing unit 11 supplies the DCT circuit 12 with the difference value between the macroblock to be encoded and the predicted image obtained from the future reference image. This difference value is encoded and output in the same manner as in intra coding.
[0171]
Further, when the macroblock input to the computing unit 11 is a B picture and the prediction mode is the bidirectional predictive coding mode, the switch SW selects the terminal d. The terminal d receives the output of the computing unit 11C, and the computing unit 11C receives the macroblock to be encoded and the output of the motion compensation circuit 20.
[0172]
When the prediction mode is the bidirectional predictive coding mode, the motion compensation circuit 20 reads the image stored in the past reference image storage circuit 19B (in this case, an I picture) as a past reference image and generates a predicted image (hereinafter referred to as a past predicted image, as appropriate) by performing motion compensation according to the motion vector MVf. It also reads the image stored in the future reference image storage circuit 19A (in this case, a P picture) as a future reference image and generates a predicted image (hereinafter referred to as a future predicted image, as appropriate) by performing motion compensation according to the motion vector MVb. The past predicted image and the future predicted image are supplied to the computing unit 11C.
[0173]
First, the computing unit 11C calculates, for example, the average (hereinafter referred to as an average predicted image, as appropriate) of the past predicted image and the future predicted image supplied from the motion compensation circuit 20. Then, the computing unit 11C subtracts the corresponding pixel value of the average predicted image from each pixel value of the macroblock to be encoded, and outputs the subtraction value (difference value). Accordingly, in this case, the difference value between the macroblock to be encoded and the average predicted image is supplied from the computing unit 11 to the DCT circuit 12. This difference value is encoded and output in the same manner as in intra coding.
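The averaging and subtraction performed by the computing unit 11C can be sketched as below. The text only says that an average is taken; rounding halves upward with integer arithmetic is an assumption of this sketch, and the function names and pixel values are illustrative.

```python
def average_prediction(past_pred, future_pred):
    """Per-pixel mean of the past and future predicted images.

    Assumption: integer pixels, with halves rounded upward ((a + b + 1) // 2).
    """
    return [[(a + b + 1) // 2 for a, b in zip(arow, brow)]
            for arow, brow in zip(past_pred, future_pred)]


def residual(block, predicted):
    """Difference value output by the computing unit 11C."""
    return [[x - p for x, p in zip(xrow, prow)]
            for xrow, prow in zip(block, predicted)]


past_pred = [[100, 102], [98, 96]]     # motion compensated with MVf
future_pred = [[104, 103], [100, 96]]  # motion compensated with MVb
block = [[103, 103], [97, 96]]         # macroblock to be encoded

avg_pred = average_prediction(past_pred, future_pred)
diff = residual(block, avg_pred)
```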
[0174]
In the present embodiment, a B picture is not used as a reference image when other images are encoded, and is therefore not locally decoded (there is no need to do so). The future reference image storage circuit 19A and the past reference image storage circuit 19B are configured so that bank switching can be performed as necessary, so that the image data stored in either circuit can be used as either a past reference image or a future reference image. Further, the above-described processing is performed on each of the luminance signal Y and the color difference signals Cb and Cr. For the color difference signals Cb and Cr, however, the motion vector used for processing the luminance signal Y is, for example, halved before use.
[0175]
Next, the process of the prediction mode determination circuit 21 of FIG. 7 (the prediction mode determination process) will be further described with reference to the flowchart of FIG. 10.
[0176]
In the prediction mode determination circuit 21, the process according to the flowchart of FIG. 10 is performed for each macroblock.
[0177]
That is, in the prediction mode determination circuit 21, it is first determined in step S1 whether SMV is less than the threshold value MV₀. If it is determined in step S1 that SMV is less than the threshold value MV₀, the process proceeds to step S2, and one of the inter codings is selected under the conditions described with reference to FIG. 5, as follows.
[0178]
That is, in step S2, it is determined whether the prediction residual Eb is larger than j times the prediction residual Ef (j × Ef). When it is determined in step S2 that Eb is larger than j × Ef, the process proceeds to step S3, where forward predictive coding is selected as inter coding, and the process ends.
[0179]
After that, as described above, the final prediction mode is determined based on the magnitude relationship between the prediction residual for the selected inter coding and the prediction residual for the intra coding.
[0180]
On the other hand, if it is determined in step S2 that Eb is not greater than j × Ef, the process proceeds to step S4, where it is determined whether the prediction residual Eb is less than k times the prediction residual Ef (k × Ef). When it is determined in step S4 that Eb is less than k × Ef, the process proceeds to step S5, where backward predictive coding is selected as the inter coding, and the process ends.
[0181]
If it is determined in step S4 that Eb is not less than k × Ef, that is, if Eb is greater than or equal to k × Ef and less than or equal to j × Ef, the process proceeds to step S6, where bidirectional predictive coding is selected as the inter coding, and the process ends.
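Steps S2 to S6 amount to a three-way comparison of Eb against j × Ef and k × Ef. A minimal Python sketch follows; the function name, the string labels, and the values chosen for j and k are assumptions for illustration only (the text does not give numeric values here).

```python
def select_inter_mode_low_motion(ef, eb, j, k):
    """Steps S2 to S6: choose the inter coding from the prediction residuals.

    ef, eb: prediction residuals against the past and future reference images
    j, k:   constants with k <= j (illustrative values are used below)
    """
    if eb > j * ef:          # step S2 -> S3
        return "forward"
    if eb < k * ef:          # step S4 -> S5
        return "backward"
    return "bidirectional"   # step S6: k * Ef <= Eb <= j * Ef


J, K = 2.0, 0.5  # placeholder constants, not values from the patent
modes = [select_inter_mode_low_motion(10, eb, J, K) for eb in (25, 4, 12)]
```

With these placeholder constants, a backward residual far larger than the forward one selects forward prediction, a far smaller one selects backward prediction, and comparable residuals select bidirectional prediction.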
[0182]
Note that, before performing the process of step S1, the prediction mode determination circuit 21 determines, for the x and y components of the forward motion vector MVf or the backward motion vector MVb, whether, for example, the expression |x| > g|y| or |y| > g|x| is satisfied, and if so, forcibly sets SMV to a value less than MV₀, such as 0. Therefore, for an image in which an object is moving in a substantially horizontal or vertical direction, for example, inter coding is selected under the conditions in which bidirectional predictive coding is easily selected, as described with reference to FIG. 5.
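The direction check performed before step S1 can be sketched as follows. The text leaves open whether the test is applied to MVf, to MVb, or to both; this sketch assumes that either vector satisfying the expression is enough. The function names and the value of g are likewise illustrative.

```python
def is_axis_dominated(mv, g):
    """True when |x| > g|y| or |y| > g|x|, i.e. motion is nearly horizontal or vertical."""
    x, y = mv
    return abs(x) > g * abs(y) or abs(y) > g * abs(x)


def effective_smv(smv, mvf, mvb, g):
    """Force SMV to 0 (a value below MV0) for nearly axis-aligned motion.

    Assumption: the override applies if either MVf or MVb passes the test.
    """
    if is_axis_dominated(mvf, g) or is_axis_dominated(mvb, g):
        return 0
    return smv


G = 4  # placeholder constant, not a value from the patent
forced = effective_smv(50, mvf=(8, 1), mvb=(3, 2), g=G)      # nearly horizontal MVf
unchanged = effective_smv(50, mvf=(3, 2), mvb=(2, 3), g=G)   # diagonal motion
```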
[0183]
On the other hand, if it is determined in step S1 that SMV is not less than MV₀, inter coding is selected in step S7₁ and the subsequent steps, as follows.
[0184]
That is, in step S7₁, it is determined whether SMV is greater than or equal to MV₀ and less than MV₁. If it is determined in step S7₁ that SMV is greater than or equal to MV₀ and less than MV₁, the process proceeds to step S8₁, where the constant Ti is set to T₁, and then proceeds to step S9.
[0185]
If it is determined in step S7₁ that SMV is not in the range of MV₀ or more and less than MV₁, the process proceeds to step S7₂, where it is determined whether SMV is greater than or equal to MV₁ and less than MV₂.
[0186]
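Thereafter, similarly, in step S7_c, it is determined whether SMV is greater than or equal to MV_{c-1} and less than MV_c. If SMV is greater than or equal to MV_{c-1} and less than MV_c, the process proceeds to step S8_c, where the constant Ti is set to T_c, and then proceeds to step S9. If not, the process proceeds to step S7_{c+1}.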
[0187]
Then, if it is determined in step S7_n that SMV is not in the range of MV_{n-1} or more and less than MV_n, that is, if SMV is greater than or equal to MV_n, the process proceeds to step S8_{n+1}, where the constant Ti is set to T_{n+1}, and then proceeds to step S9.
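The chain of comparisons in steps S7₁ to S8_{n+1} is a lookup of Ti in a table indexed by the interval that contains SMV. The sketch below expresses this with a binary search instead of the sequential tests of the flowchart, which gives the same result for ascending thresholds. The function name and the numeric thresholds and constants are illustrative assumptions.

```python
from bisect import bisect_right


def select_ti(smv, mv_thresholds, t_constants):
    """Return Ti for the interval containing SMV, or None when SMV < MV0.

    mv_thresholds: [MV0, MV1, ..., MVn] in ascending order
    t_constants:   [T1, T2, ..., T(n+1)], one per interval at or above MV0
    None means the low-motion path (step S2 onward) applies instead.
    """
    if len(mv_thresholds) != len(t_constants):
        raise ValueError("need exactly one T constant per threshold")
    i = bisect_right(mv_thresholds, smv)   # number of thresholds <= SMV
    if i == 0:
        return None                        # SMV < MV0: step S1 branches to S2
    return t_constants[i - 1]              # MV(i-1) <= SMV < MV(i), or SMV >= MVn


# Placeholder thresholds and constants, not values from the patent.
MV = [4, 8, 16]
T = [100, 200, 400]
picked = [select_ti(s, MV, T) for s in (2, 4, 10, 16, 99)]
```

Making the T values grow with the thresholds is also an assumption of this example; it matches the stated aim of making bidirectional prediction harder to select as motion becomes larger.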
[0188]
In step S9, an inter-image distance determination process corresponding to the distances Df and Db is performed, and the process ends.
[0189]
Next, the flowchart of FIG. 11 shows the details of the inter-image distance determination process in step S9 of FIG. 10. In FIG. 11, it is assumed that one or two B pictures are arranged between I or P pictures.
[0190]
In the inter-image distance determination process, it is first determined in step S11 whether Df is 1 and Db is 2. When it is determined in step S11 that Df is 1 and Db is 2, the process proceeds to step S12, and inter coding is selected as described below.
[0191]
That is, in step S12, it is determined whether Eb is greater than q × Ef + (1−p × q) × Ti and greater than p × Ef. If it is determined in step S12 that Eb is greater than q × Ef + (1−p × q) × Ti and greater than p × Ef, the process proceeds to step S13, where forward predictive coding is selected, and the process returns. If it is determined in step S12 that Eb is not greater than q × Ef + (1−p × q) × Ti or is not greater than p × Ef, the process proceeds to step S14, where it is determined whether Eb is less than r × Ef + (1−p × r) × Ti and less than p × Ef.
[0192]
If it is determined in step S14 that Eb is less than r × Ef + (1−p × r) × Ti and less than p × Ef, the process proceeds to step S15, where backward predictive coding is selected, and the process returns. If it is determined in step S14 that Eb is not less than r × Ef + (1−p × r) × Ti or is not less than p × Ef, the process proceeds to step S16, where bidirectional predictive coding is selected, and the process returns.
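The tests of steps S12 to S16 can be sketched as below, which also shows how the constant Ti shifts the decision. The function name, the string labels, and the values of p, q, and r are assumptions for illustration; only the inequalities come from the text.

```python
def select_mode_s12_s16(ef, eb, p, q, r, ti):
    """Steps S12 to S16: three-way selection with the motion-dependent constant Ti."""
    if eb > q * ef + (1 - p * q) * ti and eb > p * ef:    # step S12 -> S13
        return "forward"
    if eb < r * ef + (1 - p * r) * ti and eb < p * ef:    # step S14 -> S15
        return "backward"
    return "bidirectional"                                # step S16


P, Q, R = 1.0, 2.0, 0.5  # placeholder constants, not values from the patent

# Same residuals, different Ti: a larger Ti turns bidirectional into one-sided prediction.
small_ti = [select_mode_s12_s16(10, eb, P, Q, R, ti=0) for eb in (12, 8)]
large_ti = [select_mode_s12_s16(10, eb, P, Q, R, ti=10) for eb in (12, 8)]
```

With these placeholder values, residual pairs that select bidirectional prediction at Ti = 0 select forward or backward prediction at Ti = 10, which is the behavior the text attributes to a larger Ti.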
[0193]
On the other hand, if it is determined in step S11 that Df is not 1 or Db is not 2, the process proceeds to step S17, where it is determined whether Df is 2 and Db is 1.
[0194]
When it is determined in step S17 that Df is 2 and Db is 1, the process proceeds to step S18, and inter coding is selected as described below.
[0195]
That is, in step S18, it is determined whether Eb is greater than t × Ef + (1−s × t) × Ti and greater than s × Ef. If it is determined in step S18 that Eb is greater than t × Ef + (1−s × t) × Ti and greater than s × Ef, the process proceeds to step S19, where forward predictive coding is selected, and the process returns. If it is determined in step S18 that Eb is not greater than t × Ef + (1−s × t) × Ti or is not greater than s × Ef, the process proceeds to step S20, where it is determined whether Eb is less than u × Ef + (1−s × u) × Ti and less than s × Ef.
[0196]
If it is determined in step S20 that Eb is less than u × Ef + (1−s × u) × Ti and less than s × Ef, the process proceeds to step S21, where backward predictive coding is selected, and the process returns. If it is determined in step S20 that Eb is not less than u × Ef + (1−s × u) × Ti or is not less than s × Ef, the process proceeds to step S22, where bidirectional predictive coding is selected, and the process returns.
[0197]
On the other hand, if it is determined in step S17 that Df is not 2 or Db is not 1, the process proceeds to step S23, and one of the inter codings is selected as follows. That is, in steps S23 to S27, the same processing as in steps S2 to S6 of FIG. 10 is performed, thereby selecting the inter coding.
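The branching of FIG. 11 on the inter-image distances Df and Db can be summarized with a small sketch that only picks which constant set applies. The function name and the numeric constants are hypothetical; the step numbers in the comments follow the text.

```python
def constants_for_distances(df, db, near_past, near_future):
    """Choose the constant set used by the inter-image distance determination.

    near_past:   (p, q, r), used in steps S12 to S16 when Df = 1 and Db = 2
    near_future: (s, t, u), used in steps S18 to S22 when Df = 2 and Db = 1
    Returns None for any other distances, meaning the fallback of steps
    S23 to S27 (the same j/k comparison as steps S2 to S6) applies.
    """
    if df == 1 and db == 2:      # step S11
        return near_past
    if df == 2 and db == 1:      # step S17
        return near_future
    return None                  # step S23 onward


# Placeholder constants, not values from the patent.
PQR = (1.0, 2.0, 0.5)
STU = (1.0, 1.5, 0.4)
chosen = [constants_for_distances(df, db, PQR, STU) for df, db in ((1, 2), (2, 1), (1, 1))]
```

Keeping separate constant sets for the two asymmetric cases reflects that a B picture closer to its past reference tends to have a smaller forward residual, and vice versa.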
[0198]
As described above, since the prediction mode is determined corresponding to the SMV representing the complexity of the motion of the image, the encoding efficiency can be improved as compared with the conventional case.
[0199]
That is, when the motion of the image is complicated, the bidirectional predictive coding mode is made difficult to select, in consideration of the prediction accuracy and the amount of data necessary for transmitting the motion vectors; conversely, when the motion is not complicated, the bidirectional predictive coding mode is made easy to select. Efficient encoding can therefore be performed.
[0200]
In addition to the complexity of the motion of the image, as described above, the prediction mode can be determined in accordance with the speed of the motion of the image, or both of them.
[0201]
In the present embodiment, the complexity of the image motion is expressed by the above-described SMV, but may be expressed by other physical quantities.
[0202]
Furthermore, in the present embodiment, the speed of motion of the image is expressed by the magnitude of the motion vector and by the sum of the absolute values of its x and y components, but it can also be expressed by other physical quantities.
[0203]
Further, in this embodiment, when the bidirectional predictive coding mode is to be made easy to select, inter coding is selected under the conditions described with reference to FIG. 5. However, the bidirectional predictive coding mode can also be made easy to select by selecting inter coding under conditions of the same form as those shown in FIG. 12. However, in this case, it is desirable to make the constants a and c larger or the constants b and d smaller than in the case of FIG.
[0204]
According to simulations conducted by the present inventors, it has been confirmed that the encoding efficiency improves when q or t in FIG. 4 is made smaller than a or c in FIG. 12, respectively, and r or u in FIG. 4 is made larger than b or d in FIG. 12, respectively. Furthermore, it has been confirmed that when the prediction errors Eb and Ef are small, the encoding efficiency is improved by not using the bidirectional predictive coding mode.
[0205]
[Effects of the Invention]
  As described above, according to the image encoding device and the image encoding method of the present invention, a forward motion vector, which is the motion vector of the encoding target image with respect to a temporally preceding past reference image, and a backward motion vector, which is the motion vector of the encoding target image with respect to a temporally following future reference image, are detected. Based on the prediction residuals of the encoding target image with respect to the past reference image and the future reference image, the prediction mode of the encoding target image is determined to be one of the forward predictive coding mode, the backward predictive coding mode, and the bidirectional predictive coding mode. A predicted image is generated by performing motion compensation corresponding to the prediction mode, the difference value between the encoding target image and the predicted image is calculated, and the difference value is encoded. Let the prediction residuals with respect to the past reference image and the future reference image be Ef and Eb, respectively, and let α, β, γ, and δ be predetermined constants (where α, β, γ, and δ are real numbers and γ < β). When Eb > α × Ef and Eb > β × Ef + (1−α × β) × δ hold, the prediction mode is determined to be the forward predictive coding mode, in which the predicted image is generated only from the past reference image. When Eb ≦ α × Ef and Eb < γ × Ef + (1−α × γ) × δ hold, the prediction mode is determined to be the backward predictive coding mode, in which the predicted image is generated only from the future reference image. When γ × Ef + (1−α × γ) × δ ≦ Eb and Eb ≦ β × Ef + (1−α × β) × δ hold, the prediction mode is determined to be the bidirectional predictive coding mode, in which the predicted image is generated from both the past reference image and the future reference image. In addition, δ is increased as the magnitude of the forward motion vector or the backward motion vector increases, which makes the bidirectional predictive coding mode less likely to be determined. Therefore, efficient encoding can be performed based on the motion of the image.
[Brief description of the drawings]
FIG. 1 is a diagram showing a GOP.
FIG. 2 is a diagram showing a GOP.
FIG. 3 is a diagram for explaining that a prediction residual of a B picture varies depending on a distance from an I or P picture.
FIG. 4 is a diagram for explaining a condition for selecting a prediction mode.
FIG. 5 is a diagram illustrating a condition for selecting a prediction mode when the bidirectional predictive coding mode is easily selected.
FIG. 6 is a block diagram illustrating a configuration of an embodiment of an image encoding device to which the present invention has been applied.
FIG. 7 is a block diagram subsequent to FIG. 6.
FIG. 8 is a diagram for explaining processing of the image encoding device in FIGS. 6 and 7.
FIG. 9 is a diagram for explaining processing of the scan converter 5 of FIG. 6.
FIG. 10 is a flowchart for explaining processing of the prediction mode determination circuit 21 in FIG. 7.
FIG. 11 is a flowchart for explaining the details of the inter-image distance determination processing in step S9 in FIG. 10.
FIG. 12 is a diagram illustrating a condition for selecting a prediction mode when the bidirectional predictive coding mode is easily selected.
FIG. 13 is a diagram for explaining motion compensation predictive coding.
FIG. 14 is a diagram showing a GOP.
FIG. 15 is a diagram for explaining MPEG encoding.
FIG. 16 is a diagram illustrating a condition for selecting a prediction mode.
FIG. 17 is a diagram showing a GOP.
FIG. 18 is a diagram illustrating a condition for selecting a prediction mode.
[Explanation of symbols]
3 image encoding type designation circuit, 4 image encoding reordering circuit, 5 scan converter, 6 motion vector estimation circuit, 7 storage unit, 7A past reference image storage unit, 7B current image storage unit, 7C future reference image storage unit, 8 motion amount calculation circuit, 9 counter, 10 inter-image distance generation calculation circuit, 11 computing unit, 11A to 11C computing units, 12 DCT circuit, 13 quantization circuit, 14 transmission buffer, 15 variable length encoding circuit, 16 inverse quantization circuit, 17 IDCT circuit, 18 computing unit, 19 frame memory, 19A future reference image storage circuit, 19B past reference image storage circuit, 20 motion compensation circuit, 21 prediction mode determination circuit, 31 recording medium, 32 transmission path

Claims

An image coding apparatus comprising:
motion vector detecting means for detecting a forward motion vector, which is the motion vector of an encoding target image with respect to a temporally preceding past reference image, and a backward motion vector, which is the motion vector of the encoding target image with respect to a temporally following future reference image;
prediction mode determining means for determining, based on the prediction residuals of the encoding target image with respect to the past reference image and the future reference image, the prediction mode of the encoding target image to be one of a forward predictive coding mode, a backward predictive coding mode, and a bidirectional predictive coding mode;
motion compensation means for generating a predicted image by performing motion compensation corresponding to the prediction mode;
difference value calculating means for calculating a difference value between the encoding target image and the predicted image; and
encoding means for encoding the difference value,
wherein, when the prediction residuals with respect to the past reference image and the future reference image are Ef and Eb, respectively, and α, β, γ, and δ are predetermined constants (where α, β, γ, and δ are real numbers and γ < β), the prediction mode determining means
determines the prediction mode to be the forward predictive coding mode, in which the predicted image is generated only from the past reference image, when Eb > α × Ef and Eb > β × Ef + (1−α × β) × δ hold,
determines the prediction mode to be the backward predictive coding mode, in which the predicted image is generated only from the future reference image, when Eb ≦ α × Ef and Eb < γ × Ef + (1−α × γ) × δ hold, and
determines the prediction mode to be the bidirectional predictive coding mode, in which the predicted image is generated from both the past reference image and the future reference image, when γ × Ef + (1−α × γ) × δ ≦ Eb and Eb ≦ β × Ef + (1−α × β) × δ hold, and
wherein the bidirectional predictive coding mode is made less likely to be determined by increasing δ as the magnitude of the forward motion vector or the backward motion vector increases.

An image coding method comprising:
detecting a forward motion vector, which is the motion vector of an encoding target image with respect to a temporally preceding past reference image, and a backward motion vector, which is the motion vector of the encoding target image with respect to a temporally following future reference image;
determining, based on the prediction residuals of the encoding target image with respect to the past reference image and the future reference image, the prediction mode of the encoding target image to be one of a forward predictive coding mode, a backward predictive coding mode, and a bidirectional predictive coding mode;
generating a predicted image by performing motion compensation corresponding to the prediction mode;
calculating a difference value between the encoding target image and the predicted image; and
encoding the difference value,
wherein, when the prediction residuals with respect to the past reference image and the future reference image are Ef and Eb, respectively, and α, β, γ, and δ are predetermined constants (where α, β, γ, and δ are real numbers and γ < β),
the prediction mode is determined to be the forward predictive coding mode, in which the predicted image is generated only from the past reference image, when Eb > α × Ef and Eb > β × Ef + (1−α × β) × δ hold,
the prediction mode is determined to be the backward predictive coding mode, in which the predicted image is generated only from the future reference image, when Eb ≦ α × Ef and Eb < γ × Ef + (1−α × γ) × δ hold, and
the prediction mode is determined to be the bidirectional predictive coding mode, in which the predicted image is generated from both the past reference image and the future reference image, when γ × Ef + (1−α × γ) × δ ≦ Eb and Eb ≦ β × Ef + (1−α × β) × δ hold, and
wherein the bidirectional predictive coding mode is made less likely to be determined by increasing δ as the magnitude of the forward motion vector or the backward motion vector increases.