JP4089104B2

JP4089104B2 - Image conversion apparatus and method, learning apparatus and method, and recording medium

Info

Publication number: JP4089104B2
Application number: JP28447599A
Authority: JP
Inventors: 哲二郎近藤; 小林　　直樹; 健治高橋; 義教渡邊
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 1998-10-05
Filing date: 1999-10-05
Publication date: 2008-05-28
Anticipated expiration: 2019-10-05
Also published as: JP2000316085A

Description

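[0001]
[Technical field of the invention]
The present invention relates to an image conversion apparatus and method, a learning apparatus and method, and a recording medium, and in particular to those that generate a compressed image from which an image almost identical to the original image can be restored.
[0002]
[Prior art]
As disclosed in Japanese Patent Laid-Open No. 10-93980, the present inventors have proposed a technique for generating a high-resolution image from a low-resolution image. With this technique, a high-resolution image almost identical to the original image is said to be restorable from a low-resolution image obtained by reducing the high-resolution original image. In this proposal, as shown for example in FIG. 1, the pixel values of the 3 × 3 pixels a to i, centered on the pixel i of the high-resolution image (restored image) at the position corresponding to the target pixel I of the low-resolution image (upper layer image), are obtained by computing, for example, a linear first-order combination of a plurality of neighboring pixels of the low-resolution image (for example, the 3 × 3 pixels A to I) with predetermined prediction coefficients. Further, the error between the pixel values of the restored image and those of the original image is computed, and the pixel values of the low-resolution image and the prediction coefficients are repeatedly updated according to the result.
[0003]
[Problems to be solved by the invention]
In the conventional scheme described above, the pixel values of the low-resolution image are updated one pixel at a time under the condition that the pixel values of the neighboring pixels are held fixed. That is, as shown in FIG. 1, the pixel value of the target pixel I of the low-resolution image is updated to the value that is optimal when the pixel values of the eight pixels A to H surrounding the target pixel I and the values of the predetermined prediction coefficients are held fixed.
[0004]
Consequently, when the pixel value of pixel D is updated after that of pixel I, the previously updated value of pixel I is no longer optimal for the updated pixel D, because pixel D was one of the pixels held fixed when pixel I was updated. Therefore, when the pixel values of the low-resolution image (upper layer image) are updated sequentially one pixel at a time, the low-resolution image in which all pixel values have finally been updated is not necessarily the optimal image for restoring the original image.
[0005]
This problem could be solved by updating the pixel values of a plurality of adjacent pixels of the low-resolution image (upper layer image) to their optimal values simultaneously. However, the amount of computation is enormous: the computation takes a long time and the arithmetic circuit becomes large, so this has been practically impossible.
[0006]
The present invention has been made in view of these circumstances. By simultaneously updating the pixel values of a plurality of adjacent pixels, it makes it possible to obtain in a short time a low-resolution image from which a high-resolution image almost identical to the original image can be restored.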
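[0007]
[Means for solving the problems]
To solve the problem described above, the invention of claim 1 is an image data conversion apparatus that converts first image data into second image data of lower quality than the first image data, comprising:
an intermediate image data generation unit that generates, from the first image data, intermediate image data of substantially the same quality as the second image data;
a storage unit that stores the intermediate image data;
a block extraction unit that extracts, from the intermediate image data, a plurality of pixels for each block that is a part of one screen;
a prediction coefficient generation unit that outputs prediction coefficients that are generated or acquired in advance;
a pixel value update unit that collectively updates the pixel values of the plurality of pixels of the intermediate image data extracted by the block extraction unit, based on the prediction coefficients, the intermediate image data, and the first image data;
a predicted image data generation unit that generates predicted image data of substantially the same quality as the first image data, based on the prediction coefficients and the intermediate image data whose pixel values have been updated by the pixel value update unit;
an error detection unit that detects an error between the first image data and the predicted image data; and
a control unit that determines, based on the error, whether to use the intermediate image data as the output image.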
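[0008]
The invention of claim 6 is an image data conversion method for converting first image data into second image data of lower quality than the first image data, comprising the steps of:
generating, from the first image data, intermediate image data of substantially the same quality as the second image data;
extracting, from the intermediate image data, a plurality of pixels for each block that is a part of one screen;
outputting prediction coefficients that are generated or acquired in advance;
collectively updating the pixel values of the plurality of pixels of the intermediate image data extracted by the block extraction unit, based on the prediction coefficients, the intermediate image data, and the first image data;
generating predicted image data of substantially the same quality as the first image data, based on the prediction coefficients and the intermediate image data whose pixel values have been updated;
detecting an error between the first image data and the predicted image data; and
determining, based on the error, whether to use the intermediate image data as the output image.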
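[0009]
The invention of claim 7 is a learning apparatus that learns the pixel values of second image data when first image data is converted into the second image data, which is of lower quality than the first image data, comprising:
an intermediate image data generation unit that generates, from the first image data, intermediate image data of substantially the same quality as the second image data;
a storage unit that stores the intermediate image data;
a block extraction unit that extracts, from the intermediate image data, a plurality of pixels for each block that is a part of one screen;
a prediction coefficient generation unit that outputs prediction coefficients that are generated or acquired in advance;
a pixel value update unit that collectively updates the pixel values of the plurality of pixels of the intermediate image data extracted by the block extraction unit, based on the prediction coefficients, the intermediate image data, and the first image data;
a predicted image data generation unit that generates predicted image data of substantially the same quality as the first image data, based on the prediction coefficients and the intermediate image data whose pixel values have been updated by the pixel value update unit;
an error detection unit that detects an error between the first image data and the predicted image data; and
a control unit that determines, based on the error, whether to use the intermediate image data as the output image,
wherein the pixel value update unit updates the pixel values of the intermediate image data by the least squares method, using the prediction coefficients as student data and the corresponding first image data as teacher data.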
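[0010]
The invention of claim 8 is a learning method for learning the pixel values of second image data when first image data is converted into the second image data, which is of lower quality than the first image data, comprising the steps of:
generating, from the first image data, intermediate image data of substantially the same quality as the second image data;
extracting, from the intermediate image data, a plurality of pixels for each block that is a part of one screen;
outputting prediction coefficients that are generated or acquired in advance;
collectively updating the pixel values of the extracted plurality of pixels of the intermediate image data, based on the prediction coefficients, the intermediate image data, and the first image data;
generating predicted image data of substantially the same quality as the first image data, based on the prediction coefficients and the intermediate image data whose pixel values have been updated;
detecting an error between the first image data and the predicted image data; and
determining, based on the error, whether to use the intermediate image data as the output image,
wherein the step of updating the pixel values updates the pixel values of the intermediate image data by the least squares method, using the prediction coefficients as student data and the corresponding first image data as teacher data.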
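[0011]
The invention of claim 9 is a recording medium on which is recorded a computer-controllable program for image data conversion that converts first image data into second image data of lower quality than the first image data,
the program comprising the steps of:
generating, from the first image data, intermediate image data of substantially the same quality as the second image data;
extracting, from the intermediate image data, a plurality of pixels for each block that is a part of one screen;
generating prediction coefficients based on the extracted intermediate image data and the first image data at the positions corresponding to the extracted intermediate image data;
collectively updating the pixel values of the extracted plurality of pixels of the intermediate image data, based on the prediction coefficients, the intermediate image data, and the first image data;
generating predicted image data of substantially the same quality as the first image data, based on the prediction coefficients and the intermediate image data whose pixel values have been updated;
detecting an error between the first image data and the predicted image data; and
determining, based on the error, whether to use the intermediate image data as the output image.
[0012]
In the present invention, the prediction coefficient calculation unit generates observation equations using the prediction taps as student data and the corresponding pixels of the original image as teacher data, and generates the prediction coefficients by solving those equations. The pixel value update unit creates observation equations using the prediction coefficients from the prediction coefficient calculation unit as student data and the corresponding original image data as teacher data. Solving these observation equations yields, simultaneously, the optimal values of the plurality of updated pixel values for the given coefficients.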
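[0013]
[Embodiments of the invention]
An embodiment of the present invention is described below. FIG. 2 shows the configuration of an embodiment of an image processing apparatus to which the present invention is applied.
[0014]
Digitized image data is supplied to the transmission apparatus 101. The transmission apparatus 101 forms the average value of every group of a plurality of pixels of the input image data and replaces the plurality of pixels with that average value, thereby compressing the amount of data. It records the resulting encoded data on a recording medium 102 such as an optical disc or magnetic tape, or transmits it via a transmission path 103 such as a broadcast line (satellite broadcasting or the like), a telephone line, or the Internet.
[0015]
The receiving apparatus 104 reproduces the encoded data recorded on the recording medium 102, or receives the encoded data transmitted via the transmission path 103, and decodes the encoded data. That is, the values of the thinned-out pixels are restored. The decoded image obtained from the receiving apparatus 104 is supplied to a display (not shown) and displayed on it.
[0016]
FIG. 3 shows an example of the transmission apparatus 101. The I/F (interface) 111 receives image data supplied from outside and transmits encoded data to the transmitter/recording device 116. The ROM 112 stores programs such as the IPL (Initial Program Loading) program. The RAM 113 stores the system program (OS (Operating System)) and application programs recorded in the external storage device 115, as well as data needed for the operation of the CPU 114.
[0017]
The CPU 114 loads the system program and application programs from the external storage device 115 into the RAM 113 in accordance with the IPL program stored in the ROM 112, and executes the application programs under the control of the system program. That is, it performs the encoding processing described later on the image data supplied from the interface 111.
[0018]
The external storage device 115 is, for example, a hard disk, and stores the system program, application programs, and data. The transmitter/recording device 116 records the encoded data supplied from the interface 111 on the recording medium 102, or transmits it via the transmission path 103. The interface 111, ROM 112, RAM 113, CPU 114, and external storage device 115 are connected to one another via a bus.
[0019]
In the transmission apparatus 101 configured as above, when image data is supplied to the interface 111, the image data is passed to the CPU 114. The CPU 114 encodes the image data and supplies the resulting encoded data to the interface 111. The interface 111 records the encoded data on the recording medium 102 via the transmitter/recording device 116, or sends it out to the transmission path 103.
[0020]
FIG. 4 shows the functional configuration of the transmission apparatus 101 of FIG. 3 excluding the transmitter/recording device 116, that is, the encoder. The encoder can be implemented in hardware, software, or a combination of both. For example, by loading into a drive a recording medium that stores an encoding program such as the one shown in the flowcharts described later, the program can be installed in the external storage device 115 to provide the encoder function.
[0021]
As the medium for providing users with a computer program that performs the processing described above, communication media such as networks and satellites can be used in addition to recording media such as magnetic disks, CD-ROMs, and solid-state memories.
[0022]
In the encoder shown in FIG. 4, the input original image data is supplied to the image reduction circuit 1, the prediction coefficient calculation circuit 4, the pixel value update circuit 5, and the error calculation circuit 7. The image reduction circuit 1 divides the supplied original image (high-resolution image) into blocks of 3 × 3 pixels, as shown for example in FIG. 1, generates an initial upper layer image by taking the average of the nine pixel values in each block as the pixel value of the upper layer image (low-resolution image) pixel located at the center of the block, and outputs it to the upper layer image memory 2. The upper layer image (hereinafter, upper image) is therefore the original image reduced to 1/3 of its size both vertically and horizontally.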
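The 3 × 3 block averaging performed by the image reduction circuit 1 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `reduce_image` and the nested-list image representation are assumptions, and rows or columns left over when the size is not a multiple of 3 are simply ignored.

```python
def reduce_image(original):
    """Average each 3x3 block of `original` (a list of rows) into one upper-image pixel.

    Assumption: trailing rows/columns that do not fill a whole block are dropped.
    """
    rows, cols = len(original) // 3, len(original[0]) // 3
    return [[sum(original[3 * r + dr][3 * c + dc]
                 for dr in range(3) for dc in range(3)) / 9.0
             for c in range(cols)]
            for r in range(rows)]


# A 6x6 ramp image shrinks to 2x2, i.e. 1/3 of the size in each direction.
original = [[r * 6 + c for c in range(6)] for r in range(6)]
upper = reduce_image(original)
print(upper)
```

Each output pixel stands for the center of its source block, which is why the later steps map upper pixel (r, c) to the lower-image block starting at (3r, 3c).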
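[0023]
When forming the initial upper image, instead of the average value, the value of the pixel at the center of each block, the intermediate (median) value of the pixel values of each block, the most frequent pixel value of each block, a pixel obtained by thinning, or the like may be used.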
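The alternative initializations listed above can be sketched as interchangeable per-block reducers. The names `INITIALIZERS`, `block_pixels`, and `initial_upper` are hypothetical, and reading the "intermediate value" as the statistical median is an assumption. Thinning is left out because the source does not say which pixel of the block is kept.

```python
import statistics


def block_pixels(original, r, c):
    """Return the nine pixels of block (r, c) in raster order."""
    return [original[3 * r + dr][3 * c + dc] for dr in range(3) for dc in range(3)]


# Hypothetical table of per-block reducers; each maps nine pixel values to one value.
INITIALIZERS = {
    "mean": lambda px: sum(px) / len(px),
    "center": lambda px: px[4],        # pixel at the center of the block
    "median": statistics.median,       # assumed reading of "intermediate value"
    "mode": statistics.mode,           # most frequent pixel value
}


def initial_upper(original, method="mean"):
    """Build the initial upper image with the chosen per-block reducer."""
    pick = INITIALIZERS[method]
    rows, cols = len(original) // 3, len(original[0]) // 3
    return [[pick(block_pixels(original, r, c)) for c in range(cols)]
            for r in range(rows)]


block = [[1, 1, 2], [3, 9, 4], [5, 6, 7]]
for name in INITIALIZERS:
    print(name, initial_upper(block, name))
```

Whichever reducer is used only sets the starting point; the later update steps are what optimize the upper image.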
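[0024]
The upper layer image memory 2 stores the upper image input from the image reduction circuit 1. It also updates the pixel values of the stored upper image using the pixel values input from the pixel value update circuit 5. Further, it outputs the stored upper image data to the frame memory 9 via the switch 8.
[0025]
The prediction tap acquisition circuit 3 sequentially selects each pixel of the upper image stored in the upper layer image memory 2 as the target pixel, and supplies the pixel values of the target pixel and its neighboring pixels to the prediction coefficient calculation circuit 4, the pixel value update circuit 5, and the mapping circuit 6.
[0026]
For simplicity, the case where the update pixel value block (prediction tap) is 3 × 3 and the prediction coefficient tap is 3 × 3, as shown in FIG. 5, is described. The update pixel value block (prediction tap) means a block made up of a plurality of pixels extracted for prediction, and the prediction coefficient tap means the group of coefficients used for prediction. In FIG. 5, x denotes a pixel to be updated and X denotes a pixel whose value is held fixed. In addition, c (introduced below) denotes a coefficient, Y' a predicted value, and Y a pixel value of the original image.
[0027]
For example, when the pixel x5 shown in FIG. 5 is selected as the target pixel, the prediction coefficient calculation circuit 4 is supplied with the prediction tap consisting of the 3 × 3 pixels (pixels x1 to x9) centered on the target pixel x5. The pixel value update circuit 5 is supplied with all prediction taps whose 3 × 3 pixels include any of the nine pixels of the 3 × 3 block centered on the target pixel x5 (the 7 × 7 pixels centered on x5, enclosed by the broken line in FIG. 5). The mapping circuit 6 is supplied with the 40 (= 49 - 9) pixels (pixels X1 to X40) that remain when the 3 × 3 pixels centered on x5 are removed from the 7 × 7 pixels centered on x5.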
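The window geometry described here can be checked with a short sketch. FIG. 5 is not reproduced in this text, so the raster-order numbering of X1 to X40 below is an assumption; it was chosen because it reproduces the tap that the description later lists for pixel x3. The function names are hypothetical.

```python
def label_window():
    """Label a 7x7 window: x1..x9 for the central 3x3, X1..X40 elsewhere (raster order)."""
    labels, n_update, n_fixed = [], 0, 0
    for r in range(7):
        row = []
        for c in range(7):
            if 2 <= r <= 4 and 2 <= c <= 4:
                n_update += 1
                row.append(f"x{n_update}")   # pixel to be updated
            else:
                n_fixed += 1
                row.append(f"X{n_fixed}")    # pixel held fixed
        labels.append(row)
    return labels


def tap_at(labels, r, c):
    """Return the 3x3 prediction tap centered on (r, c), in raster order."""
    return [labels[r + dr][c + dc] for dr in (-1, 0, 1) for dc in (-1, 0, 1)]


labels = label_window()
fixed = [name for row in labels for name in row if name.startswith("X")]
tap_x5 = tap_at(labels, 3, 3)   # tap supplied to the prediction coefficient calculation circuit
tap_x3 = tap_at(labels, 2, 4)   # one of the taps that overlap the update block
print(len(fixed), tap_x5, tap_x3)
```

Under this numbering the four corners of the 5 × 5 region of tap centers are X9, X13, X28, and X32, which matches the rectangle named later in the description.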
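[0028]
The prediction coefficient calculation circuit 4 takes the prediction tap (pixels x1 to x9) centered on the target pixel x5, supplied from the prediction tap acquisition circuit 3, as learning data (student data) and the corresponding pixels of the original image as teacher data, and generates observation equations. By solving the observation equations with the least squares method, it computes prediction coefficients for nine modes (mode 1 to mode 9), as shown in FIG. 6, and supplies them to the pixel value update circuit 5 and the mapping circuit 6.
[0029]
The prediction coefficient tap of each mode, consisting of 3 × 3 prediction coefficients, is used to predict one of the pixel values of the 3 × 3 pixels centered on the pixel of the lower layer image (hereinafter, lower image) at the position corresponding to the target pixel.
[0030]
More specifically, in the pixel arrangement of FIG. 1, the prediction coefficient taps are assigned as follows: mode 1 is used to predict pixel a, mode 2 pixel b, mode 3 pixel c, mode 4 pixel h, mode 5 pixel i, mode 6 pixel d, mode 7 pixel g, mode 8 pixel f, and mode 9 pixel e.
[0031]
FIG. 7A shows the lower image predicted from the upper image shown in FIG. 5. For example, the linear first-order combination of the mode 1 prediction coefficient tap (prediction coefficients c11 to c19) with the pixel values of the pixels forming the prediction tap centered on pixel x3 (pixels X11, X12, X13, x2, x3, X17, x5, x6, X21) gives the pixel value of the pixel (pixel Y1' in FIG. 7B) adjacent to the upper left of the lower image pixel (pixel Y5' in FIG. 7B) at the position corresponding to the target pixel. Likewise, the linear first-order combination of the mode 9 prediction coefficient tap (prediction coefficients c91 to c99) with the pixel values of the pixels forming the prediction tap centered on pixel x3 gives the pixel value of the pixel (pixel Y9' in FIG. 7B) adjacent to the lower right of that lower image pixel.
[0032]
The pixel value update circuit 5 simultaneously updates the pixel values of the 3 × 3 pixels centered on the target pixel, and outputs the updated pixel values to the upper layer image memory 2 and the mapping circuit 6.
[0033]
FIG. 8 shows a detailed configuration example of the pixel value update circuit 5. The normal equation generation circuit 11 generates normal equations from the predicted values and the true values (pixel values of the original image), using the prediction coefficients input from the prediction coefficient calculation circuit 4, the pixel values of the pixels forming the prediction taps input from the prediction tap acquisition circuit 3, and the corresponding pixel values of the original image, and outputs them to the pixel value determination circuit 12. From the input normal equations, the pixel value determination circuit 12 simultaneously computes the pixel values (update values) of the 3 × 3 pixels centered on the target pixel of the upper image that minimize the error between the predicted values and the true values. In what follows, the nine pixels that are updated simultaneously are called the update pixel value tap.
[0034]
The normal equations that are generated are now described. They are generated using the pixel values in the range where the update pixel value tap and the prediction coefficient tap overlap at least partially. For example, when the pixel values of the 3 × 3 pixels (pixels x1 to x9) centered on pixel x5 in FIG. 5 are to be updated (that is, when they form the update pixel value tap), the pixel values of the pixels in the broken-line region other than the update pixel value tap (pixels X1 to X40) and all prediction coefficients c11 to c99 are held fixed, and the prediction coefficient tap is moved within the broken-line region to predict the pixel values of the lower image.
[0035]
For example, when the center of the prediction coefficient tap is moved to the position overlapping pixel x3, the linear first-order combination of the mode 1 prediction coefficient tap (prediction coefficients c11 to c19) with the pixel values of the 3 × 3 pixels centered on pixel x3 (pixels X11, X12, X13, x2, x3, X17, x5, x6, X21) gives the pixel value (predicted value) of the pixel Y1' at the upper left of the lower image pixel Y5' corresponding to the position of pixel x3. This pixel value Y1' can be expressed by the following equation (1).
[0036]
Y1' = c11·X11 + c12·X12 + c13·X13 + c14·x2 + c15·x3 + c16·X17 + c17·x5 + c18·x6 + c19·X21 ... (1)
Similarly, expressing the pixel values Y2' to Y9' as linear first-order combinations of the prediction coefficients and the pixel values of the upper image, and rewriting the nine resulting equations in matrix form, gives the following observation equation.
[0037]
Y' = cX
Here, Y' is the matrix formed by the pixel values Y1' to Y9', c is the matrix formed by the prediction coefficients c11 to c99, and X is the matrix formed by the pixel values of the upper image.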
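The observation equation Y' = cX for a single tap position can be written out as a plain matrix-vector product. This is a minimal sketch: `predict_block` is a hypothetical name, `coeffs[m][k]` stands for the coefficient written c(m+1)(k+1) in the text, and both the modes and the tap pixels are assumed to be stored in raster order.

```python
def predict_block(coeffs, tap):
    """Return [Y1', ..., Y9'] for one 3x3 tap.

    coeffs: 9 rows (one per mode) of 9 prediction coefficients.
    tap:    the 9 upper-image pixel values of the tap, in raster order.
    """
    return [sum(c * x for c, x in zip(mode, tap)) for mode in coeffs]


# Toy coefficients: every mode averages the tap, so each prediction is the tap mean.
coeffs = [[1.0 / 9.0] * 9 for _ in range(9)]
tap = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
predicted = predict_block(coeffs, tap)
print(predicted)
```

With raster-ordered modes, prediction number m (counting from 0) lands at row m // 3 and column m % 3 of the 3 × 3 lower-image block.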
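[0038]
Next, consider applying the least squares method to this observation equation to obtain predicted values Y' close to the pixel values of the original image.
[0039]
Returning to equation (1), on which the observation equation is based, the difference between the predicted value Y1' and the corresponding original pixel value Y1 is given by the following equation (2).
[0040]
Y1 - Y1' = Y1 - (c11·X11 + c12·X12 + c13·X13 + c14·x2 + c15·x3 + c16·X17 + c17·x5 + c18·x6 + c19·X21) ... (2)
Rearranging the right-hand side of equation (2) gives the following equation (3).
[0041]
Y1 - Y1' = Y1 - (c11·X11 + c12·X12 + c13·X13 + c16·X17 + c19·X21) - (c14·x2 + c15·x3 + c17·x5 + c18·x6) ... (3)
Treating the difference between the predicted value Y1' and the corresponding original pixel value Y1, that is, the left-hand side of equation (3), as the residual (written e1 = Y1' - Y1 so that the signs agree with equation (4)), and moving the constant terms of the right-hand side to the left-hand side, gives the following equation (4).
[0042]
Y1 - (c11·X11 + c12·X12 + c13·X13 + c16·X17 + c19·X21) + e1 = c14·x2 + c15·x3 + c17·x5 + c18·x6 ... (4)
Further, using the other modes of the prediction coefficient tap (mode 2 to mode 9), the following equations (5) to (12), which have the same form as equation (4), are obtained from Yn - Yn' (n = 2 to 9).
[0043]
Y2 - (c21·X11 + c22·X12 + c23·X13 + c26·X17 + c29·X21) + e2 = c24·x2 + c25·x3 + c27·x5 + c28·x6 ... (5)
Y3 - (c31·X11 + c32·X12 + c33·X13 + c36·X17 + c39·X21) + e3 = c34·x2 + c35·x3 + c37·x5 + c38·x6 ... (6)
Y4 - (c41·X11 + c42·X12 + c43·X13 + c46·X17 + c49·X21) + e4 = c44·x2 + c45·x3 + c47·x5 + c48·x6 ... (7)
Y5 - (c51·X11 + c52·X12 + c53·X13 + c56·X17 + c59·X21) + e5 = c54·x2 + c55·x3 + c57·x5 + c58·x6 ... (8)
Y6 - (c61·X11 + c62·X12 + c63·X13 + c66·X17 + c69·X21) + e6 = c64·x2 + c65·x3 + c67·x5 + c68·x6 ... (9)
Y7 - (c71·X11 + c72·X12 + c73·X13 + c76·X17 + c79·X21) + e7 = c74·x2 + c75·x3 + c77·x5 + c78·x6 ... (10)
Y8 - (c81·X11 + c82·X12 + c83·X13 + c86·X17 + c89·X21) + e8 = c84·x2 + c85·x3 + c87·x5 + c88·x6 ... (11)
Y9 - (c91·X11 + c92·X12 + c93·X13 + c96·X17 + c99·X21) + e9 = c94·x2 + c95·x3 + c97·x5 + c98·x6 ... (12)
Similarly, the position of the prediction coefficient tap is moved within the region enclosed by the broken line in FIG. 5. That is, the center of the prediction coefficient tap is moved so that it coincides in turn with every pixel (25 pixels) in the rectangular region whose vertices are pixels X9, X13, X28, and X32, and all modes of the prediction coefficient tap are used, giving 225 (= 9 × 25) equations of the same form as equations (4) to (12).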
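The construction of the 225 equations can be sketched as below: the tap is moved over the 5 × 5 centers around the target pixel, and each of the nine modes contributes one row in which the fixed pixels are folded into the constant term and the nine pixels being updated are kept as unknowns. This is an illustration under stated assumptions, not the patent's circuit: the name `build_residual_system`, the nested-list images, the raster ordering of modes and taps, and the mapping of upper pixel (r, c) to the lower-image block starting at (3r, 3c) are all assumptions, and the 7 × 7 window is assumed to lie inside the image.

```python
def build_residual_system(upper, original, coeffs, r0, c0):
    """Return (A, b) with 225 rows for the 3x3 update block centered on (r0, c0).

    Each row encodes  Y - (fixed terms) + e = sum_j A[i][j] * x_j,
    so b[i] = Y - (fixed terms) and x_1..x_9 are the unknown updated pixels.
    """
    A, b = [], []
    for r in range(r0 - 2, r0 + 3):            # 5x5 tap centers
        for c in range(c0 - 2, c0 + 3):
            tap = [(r + dr, c + dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)]
            for m in range(9):                 # nine modes per tap position
                row, const = [0.0] * 9, 0.0
                for k, (pr, pc) in enumerate(tap):
                    if abs(pr - r0) <= 1 and abs(pc - c0) <= 1:
                        # pixel inside the update block: coefficient of an unknown
                        row[(pr - r0 + 1) * 3 + (pc - c0 + 1)] = coeffs[m][k]
                    else:
                        # pixel held fixed: contributes to the constant term
                        const += coeffs[m][k] * upper[pr][pc]
                y = original[3 * r + m // 3][3 * c + m % 3]
                A.append(row)
                b.append(y - const)
    return A, b


# Toy data: coeffs[m][k] = 9*m + k + 1, a ramp upper image, an all-zero original.
coeffs = [[9 * m + k + 1 for k in range(9)] for m in range(9)]
upper = [[r * 7 + c for c in range(7)] for r in range(7)]
original = [[0.0] * 21 for _ in range(21)]
A, b = build_residual_system(upper, original, coeffs, 3, 3)
print(len(A), A[72], b[72])
```

Row 72 corresponds to the tap centered on pixel x3 with mode 1, so its nonzero entries sit in the columns for x2, x3, x5, and x6, the same pattern as equation (4).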
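[0044]
Written in matrix form, these 225 equations become a residual equation of the form [teacher data] + [residual e] = [learning data c] × [predicted pixel values x], as shown in the following equation (13).
[0045]
[Math. 1] (equation (13))

[0046]
To simplify the notation, equation (13) shows only the part corresponding to equations (4) to (12). Also, aij (i = 1, 2, ..., m (= 225); j = 1, 2, ..., 9) is equal to the entry in row i, column j of the matrix [learning data c].
[0047]
In this case, the predicted pixel values xi that give predicted values Y' close to the original pixel values Y can be found by minimizing the following squared error.
[0048]
[Math. 2]

[0049]
Therefore, the predicted pixel values xi for which the derivative of the above squared error with respect to xi is 0, that is, those satisfying the following equation, are the optimal values for obtaining predicted values Y' close to the original pixel values Y.
[0050]
[Math. 3]

[0051]
First, differentiating equation (13) with respect to the predicted pixel values xi gives the following equations.
[0052]
[Math. 4]

[0053]
Equation (17) is obtained from equations (14) and (16).
[0054]
[Math. 5] (equation (17))

[0055]
Further, letting Y'' denote the teacher data of equation (13) (the original pixel value Y minus the constant terms) and taking into account the relationship among the teacher data Y'', the prediction coefficients c, the predicted pixel values x, and the residuals e, normal equations such as the following equation (18) are obtained from equation (17).
[0056]
[Math. 6] (equation (18))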
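The equation images for this passage are not reproduced in this text. The following is a sketch of the standard least-squares derivation that the surrounding paragraphs describe, not a transcription of the patent's own formulas. The matrix name A, the summation index k, and the symbol E are assumptions, and the patent's equation numbers are deliberately not assigned to individual lines.

```latex
\begin{align*}
% residual equation, one row per observation (m = 225, nine unknowns)
Y''_i + e_i &= \sum_{k=1}^{9} a_{ik}\, x_k , & i &= 1, \dots, m \\
% squared error to be minimized
E &= \sum_{i=1}^{m} e_i^{2} \\
% stationarity condition for each unknown x_j
\frac{\partial E}{\partial x_j} &= 2 \sum_{i=1}^{m} e_i \frac{\partial e_i}{\partial x_j} = 0 , & j &= 1, \dots, 9 \\
% since e_i = \sum_k a_{ik} x_k - Y''_i
\frac{\partial e_i}{\partial x_j} &= a_{ij}
\quad\Longrightarrow\quad \sum_{i=1}^{m} a_{ij}\, e_i = 0 \\
% substituting e_i gives the normal equations
\sum_{k=1}^{9} \Bigl( \sum_{i=1}^{m} a_{ij}\, a_{ik} \Bigr) x_k &= \sum_{i=1}^{m} a_{ij}\, Y''_i
\quad\Longleftrightarrow\quad A^{\mathsf{T}} A\, x = A^{\mathsf{T}} Y''
\end{align*}
```

The last line is a 9 × 9 linear system, so all nine pixel values of the update block are obtained in a single solve.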

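[0057]
Solving the resulting normal equations, for example by the sweep-out method (Gauss-Jordan elimination), gives the optimal pixel values of the update pixel value tap for the prediction coefficient taps supplied from the prediction coefficient calculation circuit 4.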
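A minimal sketch of this solve step, assuming the residual system is given as a list of rows `A` and a right-hand side `b` (both names hypothetical). It forms the normal equations and reduces them by Gauss-Jordan elimination; the partial pivoting is an addition for numerical safety that the source does not mention.

```python
def solve_normal_equations(A, b):
    """Least-squares solution of A x ~ b via the normal equations (A^T A) x = A^T b."""
    n = len(A[0])
    # Augmented matrix [A^T A | A^T b]
    M = [[sum(row[i] * row[j] for row in A) for j in range(n)]
         + [sum(row[i] * y for row, y in zip(A, b))]
         for i in range(n)]
    for col in range(n):
        # Partial pivoting (assumption, not in the source)
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("normal equations are singular")
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        # Sweep the pivot column out of every other row (Gauss-Jordan)
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]


# Three consistent observations of two unknowns: the exact solution is x = [1, 2].
x = solve_normal_equations([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [1.0, 2.0, 3.0])
print(x)
```

For the case described in the text, `A` would have 225 rows and 9 columns, and the returned list would hold the nine updated pixel values.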
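[0058]
Returning to FIG. 4: the mapping circuit 6 locally decodes part of the lower image (the range affected by the pixels of the update pixel value tap) by forming linear first-order combinations of the following: the pixel values of the nine pixels of the update pixel value tap centered on the target pixel, supplied from the pixel value update circuit 5; the pixel values of the 40 pixels that remain when the 3 × 3 pixels centered on the target pixel are removed from the 7 × 7 pixels centered on the target pixel, supplied from the prediction tap acquisition circuit 3; and the prediction coefficients of the prediction coefficient taps for the nine modes, input from the prediction coefficient calculation circuit 4. The locally decoded pixel values of the lower image are supplied to the error calculation circuit 7.
[0059]
The error calculation circuit 7 computes the error between the locally decoded pixel values of the lower image from the mapping circuit 6 and the corresponding pixel values of the original image. In the following description, S/N is used as the error measure, where S/N = 20 log10(255/err) (err: standard deviation of the error). When the S/N is at or above a threshold, it is judged that optimal pixels have been generated, and the switch 8 is turned on. Instead of evaluating the S/N on the partially locally decoded image, the S/N may be evaluated over the whole image.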
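The S/N measure can be sketched as follows. The function name `snr_db` is hypothetical, and `err` is computed here as the root-mean-square pixel difference, which is an assumption: the text calls it the standard deviation of the error, and the two agree only when the mean error is zero.

```python
import math


def snr_db(original, decoded):
    """S/N = 20 * log10(255 / err) for two equal-length pixel sequences.

    Assumption: err is the RMS difference (standard deviation about zero).
    """
    diffs = [o - d for o, d in zip(original, decoded)]
    err = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    if err == 0.0:
        return math.inf          # identical images
    return 20.0 * math.log10(255.0 / err)


print(snr_db([0.0, 0.0], [25.5, 25.5]))
```

A larger value means a smaller error, which is why the encoder accepts the update once the S/N reaches the threshold.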
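[0060]
The frame memory 9 stores the partially optimized upper image input from the upper layer image memory 2 via the switch 8, updating its contents each time an image is input. Thus, once every pixel of the upper image stored in the upper layer image memory 2 has served as the target pixel, the frame memory 9 holds the optimal upper image in which all pixels have been optimized.
[0061]
The optimal upper image stored in the frame memory 9 is output, together with the prediction coefficient taps for the nine modes, to the decoder (described later with reference to FIG. 13) at a predetermined timing. A control unit 10 is provided to control the encoder processing described below. The control unit 10 receives the output of the error calculation circuit 7 and generates the signal that controls the switch 8. It also supplies various control signals to each block to carry out the encoder processing.
[0062]
Next, the outline of the encoder's optimal upper pixel value generation processing is described with reference to the flowchart of FIG. 9. The processing is described in terms of the configuration of FIG. 4. However, it is not limited to hardware having the configuration of FIG. 4; the CPU 114 may perform it in accordance with a software program installed from outside or stored in the ROM 112 of FIG. 3. In that case, each step is carried out under the control of the CPU 114 in accordance with the software program.
[0063]
In step S1, the image reduction circuit 1 divides the supplied original image (high-resolution image) into blocks of 3 × 3 pixels, generates the initial upper image by taking the average of the nine pixel values in each block as the pixel value of the upper image (low-resolution image) pixel located at the center of the block, and stores it in the upper layer image memory 2.
[0064]
The prediction tap acquisition circuit 3 sequentially selects each pixel of the upper image stored in the upper layer image memory 2 as the target pixel, and acquires from the upper layer image memory 2 the pixel values of the 7 × 7 pixels centered on the target pixel. Of the 49 acquired pixel values, those of the 3 × 3 pixels centered on the target pixel are supplied to the prediction coefficient calculation circuit 4. All acquired pixel values are supplied to the pixel value update circuit 5. The pixel values of the 40 (= 49 - 9) pixels other than the 3 × 3 pixels centered on the target pixel are supplied to the mapping circuit 6. For example, when the pixel x5 shown in FIG. 5 is selected as the target pixel, the pixel values of the prediction tap of 3 × 3 pixels (pixels x1 to x9) centered on x5 are supplied to the prediction coefficient calculation circuit 4, the pixel values of the 7 × 7 pixels centered on x5 are supplied to the pixel value update circuit 5, and the pixel values of the 40 (= 49 - 9) pixels that remain when the 3 × 3 pixels centered on x5 are removed from those 7 × 7 pixels are supplied to the mapping circuit 6.
[0065]
In step S2, the prediction coefficient calculation circuit 4 takes the prediction tap of 3 × 3 pixels centered on the target pixel, supplied from the prediction tap acquisition circuit 3, as learning data (student data) and the corresponding pixels of the original image as teacher data, generates observation equations, and solves them by the least squares method to obtain the prediction coefficient taps for the nine modes, which it supplies to the pixel value update circuit 5 and the mapping circuit 6. When the prediction coefficients are obtained, equations are set up for all pixels in the screen.
[0066]
In step S3, the normal equation generation circuit 11 of the pixel value update circuit 5 generates the observation equation shown in equation (13), using the prediction coefficient taps input from the prediction coefficient calculation circuit 4, the pixel values of the 7 × 7 pixels centered on the target pixel supplied from the prediction tap acquisition circuit 3, and the corresponding pixel values of the original image, and outputs it to the pixel value determination circuit 12. The pixel value determination circuit 12 solves the input observation equation by the least squares method and outputs the resulting pixel values of the update pixel value tap to the upper layer image memory 2 and the mapping circuit 6.
[0067]
The upper layer image memory 2 uses the pixel values of the update pixel value tap input from the pixel value update circuit 5 to update the corresponding pixel values of the upper image it has stored so far. The mapping circuit 6 computes linear first-order combinations of the pixel values of the update pixel value tap input from the pixel value update circuit 5 and the pixel values of the 40 pixels (the 7 × 7 pixels centered on the target pixel minus the 3 × 3 pixels centered on it) input from the prediction tap acquisition circuit 3 with the prediction coefficient taps for the nine modes input from the prediction coefficient calculation circuit 4, and thereby locally decodes part of the lower image. The locally decoded pixel values of the lower image are supplied to the error calculation circuit 7.
[0068]
In step S4, the error calculation circuit 7 computes the S/N between the locally decoded pixel values of the lower image from the mapping circuit 6 and the corresponding pixel values of the original image, and determines whether the S/N is at or above a predetermined threshold. If it is not, the processing of steps S2 to S4 is repeated. If it is, the processing proceeds to step S5.
[0069]
In step S5, the switch 8 is turned on under the control of the error calculation circuit 7, and the partially optimized upper image is output from the upper layer image memory 2 to the frame memory 9 via the switch 8.
[0070]
By executing this optimal upper pixel value generation processing for all pixels of the upper image stored in the upper layer image memory 2, the frame memory 9 comes to hold the optimal upper image in which all pixels have been optimized. The stored optimal upper image is output to the decoder at a predetermined timing together with the prediction coefficient taps for the nine modes.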
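[0071]
Several examples of the encoder processing of FIG. 9 are now described. The first method, shown in the flowchart of FIG. 10, updates every pixel once for each update of the prediction coefficients.
[0072]
In step S21, the encoder generates the upper image by reducing the original image. The encoder then updates the prediction coefficients using all pixels of the whole screen (step S22). In the next step S23, the encoder updates the pixel values of a block (synonymous with the update pixel value tap). In step S24, it is determined whether all blocks have been processed; if not, the processing returns to step S23 and is repeated.
[0073]
When it is determined in step S24 that the pixel values of all blocks have been updated, the encoder maps (locally decodes) the updated upper image and computes the S/N that indicates the error with respect to the lower image (step S25). In step S26, it is determined whether the S/N is at or above the threshold. If it is, the encoder outputs the updated upper image to the frame memory 9 and also outputs the prediction coefficients (step S27). If the S/N is below the threshold in step S26, the processing returns to step S22 and the processing from step S22 onward is repeated.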
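The control flow of the first method can be sketched with the individual steps passed in as callables. Everything named here (`encode_first_method`, its parameters, and the `max_rounds` safety limit) is hypothetical; the source loops until the threshold is met and does not mention an iteration cap.

```python
def encode_first_method(update_coefficients, update_block, blocks,
                        evaluate_snr, threshold, max_rounds=50):
    """One coefficient update, then one pass over every block, until S/N >= threshold."""
    for round_no in range(1, max_rounds + 1):
        coeffs = update_coefficients()            # step S22: whole-screen coefficient update
        for block in blocks:                      # steps S23 and S24: update every block
            update_block(block, coeffs)
        if evaluate_snr(coeffs) >= threshold:     # steps S25 and S26: map and evaluate
            return coeffs, round_no               # step S27: output image and coefficients
    raise RuntimeError("S/N threshold not reached within max_rounds")


# Stub steps that only record the call order; each block update "improves" S/N by 5 dB.
log, state = [], {"snr": 0.0}

def stub_coefficients():
    log.append("coef")
    return "coeffs"

def stub_block(block, coeffs):
    log.append(block)
    state["snr"] += 5.0

coeffs, rounds = encode_first_method(stub_coefficients, stub_block,
                                     ["b0", "b1"], lambda c: state["snr"], 20.0)
print(rounds, log)
```

The second method described next would differ only in calling the coefficient update before every single block update rather than once per pass.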
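[0074]
FIG. 11 is a flowchart of the second method, which updates the pixel values of only one block for each update of the prediction coefficients. It differs from the flowchart of FIG. 10 only in that, when the pixel values of all blocks have not yet been updated, the processing returns to the whole-screen prediction coefficient update of step S22 rather than to step S23 (pixel value update).
[0075]
FIG. 12 is a flowchart of the third method, in which the updated pixel values are evaluated both after the prediction coefficients are updated and after the pixel values are updated.
[0076]
In FIG. 12, after the original image is reduced (step S31), the prediction coefficients are updated using all pixels (step S32). The encoder maps the updated upper image and computes the S/N, which is the error with respect to the lower image (step S33). In step S34, it is determined whether the S/N is at or above the threshold. If it is, the encoder outputs the updated upper image to the frame memory 9 and also outputs the prediction coefficients (step S35).
[0077]
If the S/N is below the threshold in step S34, the processing moves to step S36, where the encoder updates the pixel values of a block. In step S37, it is determined whether all blocks have been processed; if not, the processing returns to step S36 and is repeated.
[0078]
When it is determined in step S37 that the pixel values of all blocks have been updated, the encoder maps the updated upper image and computes the S/N, which is the error with respect to the lower image (step S38). In step S39, it is determined whether the S/N is at or above the threshold. If it is, the encoder outputs the updated upper image to the frame memory 9 and also outputs the prediction coefficients (step S35). If the S/N is below the threshold in step S39, the processing returns to step S32 and the processing from step S32 onward is repeated.
[0079]
In the embodiment described above, both the prediction coefficients and the pixel values of the upper image are optimized. However, in the present invention it is also possible to optimize only the pixel values by obtaining the prediction coefficients in advance. In that case, the prediction coefficients are generated in advance by performing, on digital images for coefficient determination, the same processing as the prediction coefficient generation processing in the encoder. Because these prediction coefficients are shared by the encoder and the decoder, they do not need to be recorded on the recording medium or transmitted.
[0080]
Next, a configuration example of the decoder that restores the original image (predicts the lower image) from the optimal upper image output by the encoder is described with reference to FIG. 13. In this decoder, the optimal upper image input from the encoder is stored in the optimal upper layer image memory 21, and the prediction coefficient taps for the nine modes are supplied to the mapping circuit 23.
[0081]
The prediction tap acquisition circuit 22 sequentially selects each pixel of the optimal upper image stored in the optimal upper layer image memory 21 as the target pixel, acquires from the optimal upper layer image memory 21 the prediction tap of 3 × 3 pixels centered on the target pixel, and outputs it to the mapping circuit 23.
[0082]
The mapping circuit 23 computes linear first-order combinations of the pixel values of the nine pixels forming the prediction tap input from the prediction tap acquisition circuit 22 with the prediction coefficient taps for the nine modes supplied from the encoder, thereby predicting the pixel values of the 3 × 3 pixels centered on the lower image pixel corresponding to the position of the target pixel (that is, restoring the pixels of the original image). The predicted pixel values of the 3 × 3 pixels of the lower image are output to the frame memory 24 and stored there. The pixel values of the lower image stored in the frame memory 24 are output frame by frame at a predetermined timing to a display or the like (not shown).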
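The decoder's mapping step can be sketched as a loop over the upper image in which each target pixel produces a 3 × 3 block of the lower image. The name `decode`, the nested-list images, and the raster ordering of modes are assumptions. Clamping the tap at the image border is also an assumption, since the source does not say how edge pixels are handled.

```python
def decode(upper, coeffs):
    """Predict a 3H x 3W lower image from an H x W upper image and nine coefficient taps."""
    h, w = len(upper), len(upper[0])
    lower = [[0.0] * (3 * w) for _ in range(3 * h)]
    for r in range(h):
        for c in range(w):
            # 3x3 prediction tap around the target pixel; edges are clamped (assumption)
            tap = [upper[min(max(r + dr, 0), h - 1)][min(max(c + dc, 0), w - 1)]
                   for dr in (-1, 0, 1) for dc in (-1, 0, 1)]
            for m in range(9):
                # mode m fills position (m // 3, m % 3) of the 3x3 output block
                lower[3 * r + m // 3][3 * c + m % 3] = sum(
                    k * p for k, p in zip(coeffs[m], tap))
    return lower


# Toy coefficients: every mode copies the center tap pixel (nearest-neighbor enlargement).
copy_center = [[0, 0, 0, 0, 1, 0, 0, 0, 0] for _ in range(9)]
lower = decode([[1, 2], [3, 4]], copy_center)
print(lower[0], lower[5])
```

With coefficients learned by the encoder instead of this toy set, the same loop would produce the restored image that is written to the frame memory 24.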
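[0083]
The decoder's original image restoration processing is described with reference to the flowchart of FIG. 14. This processing starts after the optimal upper image generated by the encoder has been stored in the optimal upper layer image memory 21 and the prediction coefficient taps for the nine modes have been supplied to the mapping circuit 23.
[0084]
In step S11, the prediction tap acquisition circuit 22 selects one of the pixels of the optimal upper image stored in the optimal upper layer image memory 21 as the target pixel. In step S12, the prediction tap acquisition circuit 22 acquires from the optimal upper layer image memory 21 the prediction tap of 3 × 3 pixels centered on the target pixel and outputs it to the mapping circuit 23.
[0085]
In step S13, the mapping circuit 23 computes linear first-order combinations of the pixel values of the nine pixels forming the prediction tap input from the prediction tap acquisition circuit 22 with the prediction coefficient taps for the nine modes supplied from the encoder, thereby predicting the pixel values of the 3 × 3 pixels centered on the lower image pixel corresponding to the position of the target pixel (that is, restoring the pixels of the original image). The predicted pixel values of the 3 × 3 pixels of the lower image are output to the frame memory 24 and stored there.
[0086]
In step S14, the prediction tap acquisition circuit 22 determines whether every pixel of the optimal upper image stored in the optimal upper layer image memory 21 has been selected as the target pixel. The processing of steps S11 to S14 is repeated until it determines that all pixels have been selected. When it determines that all pixels have been selected, the processing proceeds to step S15.
[0087]
In step S15, the pixel values of the lower image stored in the frame memory 24 are output frame by frame at a predetermined timing to a display or the like (not shown).
[0088]
According to this embodiment, an upper image that yields a restored image with a higher S/N than the conventional method can be obtained.
[0089]
[Effects of the invention]
As described above, according to the present invention, the pixel values of a plurality of pixels can be optimized simultaneously, block by block. This simplifies the processing and shortens the processing time.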
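[Brief description of the drawings]
[FIG. 1] A schematic diagram showing the pixel arrangement used to explain the previously proposed encoding.
[FIG. 2] A block diagram showing the overall configuration of an image data conversion apparatus to which the present invention is applied.
[FIG. 3] A block diagram showing a functional configuration example of the transmission apparatus in FIG. 2.
[FIG. 4] A block diagram showing a configuration example of an encoder to which the present invention is applied.
[FIG. 5] A diagram explaining the processing of the prediction tap acquisition circuit 3 of FIG. 4.
[FIG. 6] A diagram explaining the prediction coefficient taps.
[FIG. 7] A diagram explaining the lower layer image.
[FIG. 8] A block diagram showing a configuration example of the pixel value update circuit 5 of FIG. 4.
[FIG. 9] A flowchart outlining the optimal pixel value generation processing of the encoder of FIG. 4.
[FIG. 10] A flowchart explaining one example of the optimal pixel value generation processing of the encoder of FIG. 4.
[FIG. 11] A flowchart explaining another example of the optimal pixel value generation processing of the encoder of FIG. 4.
[FIG. 12] A flowchart explaining yet another example of the optimal pixel value generation processing of the encoder of FIG. 4.
[FIG. 13] A block diagram showing a configuration example of a decoder that restores the original image from the optimal upper image generated by the encoder of FIG. 4.
[FIG. 14] A flowchart explaining the original image restoration processing of the decoder of FIG. 13.
[Explanation of reference numerals]
1: image reduction circuit; 2: upper layer image memory; 3: prediction tap acquisition circuit; 4: prediction coefficient calculation circuit; 5: pixel value update circuit; 6: mapping circuit; 7: error calculation circuit; 11: normal equation generation circuit; 12: pixel value determination circuit
[0001]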
BACKGROUND OF THE INVENTION
The present invention relates to an image conversion apparatus and method, a learning apparatus and method, and a recording medium, and in particular to those that generate a compressed image from which an image almost identical to the original image can be restored.
[0002]
[Prior art]
As disclosed in Japanese Patent Laid-Open No. 10-93980, the present inventors have proposed a technique for generating a high-resolution image from a low-resolution image. With this technique, a high-resolution image almost identical to the original image is said to be restorable from a low-resolution image obtained by reducing the high-resolution original image. In this proposal, for example, as shown in FIG. 1, the pixel values of the 3 × 3 pixels a to i, centered on the pixel i of the high-resolution image (restored image) at the position corresponding to the target pixel I of the low-resolution image (upper layer image), are obtained by calculating, for example, a linear primary combination of a plurality of neighboring pixels of the low-resolution image (for example, the 3 × 3 pixels A to I) and predetermined prediction coefficients. Further, an error between the pixel values of the restored image and the pixel values of the original image is calculated, and the pixel values of the low-resolution image and the prediction coefficients are repeatedly updated according to the result.
[0003]
[Problems to be solved by the invention]
In the conventional scheme described above, the pixel values of the low-resolution image are updated one pixel at a time under the condition that the pixel values of the neighboring pixels are fixed. That is, as shown in FIG. 1, the pixel value of the target pixel I in the low-resolution image is updated to the value that is optimal under the condition that the pixel values of the eight pixels A to H surrounding the target pixel I and the values of the predetermined prediction coefficients are fixed.
[0004]
Accordingly, when the pixel value of the pixel D is updated after the pixel value of the pixel I is updated, the previously updated pixel value of the pixel I is not optimal for the updated pixel D, because the pixel D is a pixel whose value was fixed when the pixel I was updated. Therefore, when the pixel values of the low-resolution image (upper layer image) are sequentially updated one pixel at a time, the low-resolution image (upper layer image) in which all pixel values have finally been updated is not necessarily the optimal image for restoring the original image.
[0005]
This problem could be solved by updating the pixel values of a plurality of adjacent pixels of the low-resolution image (upper layer image) to their optimal values at the same time. However, the amount of calculation is enormous: the calculation takes a long time and the scale of the arithmetic circuit increases, so this has been practically impossible.
[0006]
The present invention has been made in view of such circumstances. By simultaneously updating the pixel values of a plurality of adjacent pixels, it makes it possible to obtain in a short time a low-resolution image from which a high-resolution image substantially identical to the original image can be restored.
[0007]
[Means for Solving the Problems]
In order to solve the above-described problem, the invention of claim 1 is an image data conversion apparatus that converts first image data into second image data of lower quality than the first image data, comprising:
an intermediate image data generation unit that generates, from the first image data, intermediate image data of substantially the same quality as the second image data;
a storage unit that stores the intermediate image data;
a block extraction unit that extracts, from the intermediate image data, a plurality of pixels for each block that is a part of one screen;
a prediction coefficient generation unit that outputs prediction coefficients that are generated or acquired in advance;
a pixel value update unit that collectively updates the pixel values of the plurality of pixels of the intermediate image data extracted by the block extraction unit, based on the prediction coefficients, the intermediate image data, and the first image data;
a predicted image data generation unit that generates predicted image data of substantially the same quality as the first image data, based on the prediction coefficients and the intermediate image data whose pixel values have been updated by the pixel value update unit;
an error detection unit that detects an error between the first image data and the predicted image data; and
a control unit that determines, based on the error, whether to use the intermediate image data as the output image.
[0008]
The invention of claim 6 is an image data conversion method for converting first image data into second image data of lower quality than the first image data, comprising the steps of:
generating, from the first image data, intermediate image data of substantially the same quality as the second image data;
extracting, from the intermediate image data, a plurality of pixels for each block that is a part of one screen;
outputting prediction coefficients that are generated or acquired in advance;
collectively updating the pixel values of the plurality of pixels of the intermediate image data extracted by the block extraction unit, based on the prediction coefficients, the intermediate image data, and the first image data;
generating predicted image data of substantially the same quality as the first image data, based on the prediction coefficients and the intermediate image data whose pixel values have been updated;
detecting an error between the first image data and the predicted image data; and
determining, based on the error, whether to use the intermediate image data as the output image.
[0009]
The invention of claim 7 is a learning apparatus that learns the pixel values of second image data when first image data is converted into the second image data, which is of lower quality than the first image data, comprising:
an intermediate image data generation unit that generates, from the first image data, intermediate image data of substantially the same quality as the second image data;
a storage unit that stores the intermediate image data;
a block extraction unit that extracts, from the intermediate image data, a plurality of pixels for each block that is a part of one screen;
a prediction coefficient generation unit that outputs prediction coefficients that are generated or acquired in advance;
a pixel value update unit that collectively updates the pixel values of the plurality of pixels of the intermediate image data extracted by the block extraction unit, based on the prediction coefficients, the intermediate image data, and the first image data;
a predicted image data generation unit that generates predicted image data of substantially the same quality as the first image data, based on the prediction coefficients and the intermediate image data whose pixel values have been updated by the pixel value update unit;
an error detection unit that detects an error between the first image data and the predicted image data; and
a control unit that determines, based on the error, whether to use the intermediate image data as the output image,
wherein the pixel value update unit updates the pixel values of the intermediate image data by the least squares method, using the prediction coefficients as student data and the corresponding first image data as teacher data.
[0010]
The invention of claim 8 is a learning method for learning the pixel values of second image data when first image data is converted into the second image data, which is of lower quality than the first image data, comprising the steps of:
generating, from the first image data, intermediate image data of substantially the same quality as the second image data;
extracting, from the intermediate image data, a plurality of pixels for each block that is a part of one screen;
outputting prediction coefficients that are generated or acquired in advance;
collectively updating the pixel values of the extracted plurality of pixels of the intermediate image data, based on the prediction coefficients, the intermediate image data, and the first image data;
generating predicted image data of substantially the same quality as the first image data, based on the prediction coefficients and the intermediate image data whose pixel values have been updated;
detecting an error between the first image data and the predicted image data; and
determining, based on the error, whether to use the intermediate image data as the output image,
wherein the step of updating the pixel values updates the pixel values of the intermediate image data by the least squares method, using the prediction coefficients as student data and the corresponding first image data as teacher data.
[0011]
The invention of claim 9 is a recording medium on which is recorded a computer-controllable program for image data conversion that converts first image data into second image data of lower quality than the first image data,
the program comprising the steps of:
generating, from the first image data, intermediate image data of substantially the same quality as the second image data;
extracting, from the intermediate image data, a plurality of pixels for each block that is a part of one screen;
generating prediction coefficients based on the extracted intermediate image data and the first image data at the positions corresponding to the extracted intermediate image data;
collectively updating the pixel values of the extracted plurality of pixels of the intermediate image data, based on the prediction coefficients, the intermediate image data, and the first image data;
generating predicted image data of substantially the same quality as the first image data, based on the prediction coefficients and the intermediate image data whose pixel values have been updated;
detecting an error between the first image data and the predicted image data; and
determining, based on the error, whether to use the intermediate image data as the output image.
[0012]
In the present invention, the prediction coefficient calculation unit generates an observation equation by using the prediction tap as student data and the corresponding original image pixels as teacher data, and generates a prediction coefficient by solving the observation equation. The pixel value update unit creates an observation equation using the prediction coefficient from the prediction coefficient calculation unit as student data and corresponding original image data as teacher data. By solving this observation equation, an optimum value of a plurality of updated pixel values for a given coefficient can be obtained simultaneously.
[0013]
DETAILED DESCRIPTION OF THE INVENTION
Hereinafter, an embodiment of the present invention will be described. FIG. 2 shows a configuration of an embodiment of an image processing apparatus to which the present invention is applied.
[0014]
The transmission apparatus 101 is supplied with digitized image data. The transmission apparatus 101 forms the average value of every group of a plurality of pixels of the input image data and replaces the plurality of pixels with that average value, thereby compressing the amount of data. It records the resulting encoded data on a recording medium 102 such as an optical disc or a magnetic tape, or transmits it via a transmission path 103 such as a broadcast line (satellite broadcast or the like), a telephone line, or the Internet.
[0015]
The receiving device 104 reproduces the encoded data recorded on the recording medium 102 or receives the encoded data transmitted via the transmission path 103, and decodes the encoded data. That is, the thinned pixel values are restored. The decoded image obtained from the receiving device 104 is supplied to a display (not shown) and displayed on the display.
[0016]
FIG. 3 shows an example of the transmission apparatus 101. The I / F (interface) 111 performs reception processing of image data supplied from the outside and transmission processing of encoded data to the transmitter / recording device 116. The ROM 112 stores a program for IPL (Initial Program Loading). The RAM 113 stores system programs (OS (Operating System)) and application programs recorded in the external storage device 115, and stores data necessary for the operation of the CPU 114.
[0017]
The CPU 114 expands the system program and application program from the external storage device 115 to the RAM 113 according to the IPL program stored in the ROM 112, and executes the application program under the control of the system program. That is, encoding processing as described later is performed on the image data supplied from the interface 111.
[0018]
The external storage device 115 is, for example, a hard disk, and stores system programs, application programs, and data. The transmitter/recording device 116 records the encoded data supplied from the interface 111 on the recording medium 102 or transmits it via the transmission path 103. The interface 111, the ROM 112, the RAM 113, the CPU 114, and the external storage device 115 are connected to each other via a bus.
[0019]
In the transmission apparatus 101 having the above-described configuration, when image data is supplied to the interface 111, the image data is supplied to the CPU 114. The CPU 114 encodes the image data and supplies the encoded data obtained as a result to the interface 111. The interface 111 records the encoded data on the recording medium 102 via the transmitter / recording device 116 or sends it to the transmission path 103.
[0020]
FIG. 4 shows the functional configuration of the part of the transmission apparatus 101 of FIG. 3 other than the transmitter/recording device 116, that is, the encoder. The encoder can be realized by hardware, software, or a combination of both. For example, by mounting in a drive a recording medium that stores an encoding processing program corresponding to the flowcharts described later, the program can be installed in the external storage device 115 to realize the function of the encoder.
[0021]
As a medium for providing a user with a computer program that performs the processing described above, a communication medium such as a network or a satellite can be used in addition to a recording medium such as a magnetic disk, a CD-ROM, or a solid-state memory.
[0022]
In the encoder shown in FIG. 4, input original image data is supplied to the image reduction circuit 1, the prediction coefficient calculation circuit 4, the pixel value update circuit 5, and the error calculation circuit 7. The image reduction circuit 1 divides the supplied original image (high resolution image) into blocks of 3 × 3 pixels, for example, as shown in FIG. 1, generates an initial upper layer image (low resolution image) in which the average of the pixel values of the 9 pixels in each block is the pixel value of the upper layer pixel located at the center of the block, and outputs it to the upper layer image memory 2. Therefore, the upper layer image (hereinafter referred to as an upper image) is obtained by reducing the vertical and horizontal sizes of the original image to 1/3.
[0023]
When forming the initial upper image, instead of the average value, the value of the pixel located at the center of each block, the median of the pixel values of each block, the most frequent pixel value of each block, or a pixel value formed in a similar manner may be used.
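The reduction step can be sketched as follows. This is a minimal illustration, not the circuit's actual implementation: the function name `reduce_image`, the `method` parameter, and the list-of-lists image representation are assumptions introduced here, and only the mean, center-pixel, and median variants mentioned in the text are shown.

```python
from statistics import mean, median

def reduce_image(image, block=3, method="mean"):
    """Shrink a 2-D list of pixel values by `block` in each direction.

    Each block x block group of original pixels yields one upper-image pixel.
    Rows/columns that do not fill a whole block are ignored (an assumption).
    """
    rows, cols = len(image), len(image[0])
    upper = []
    for r in range(0, rows - rows % block, block):
        out_row = []
        for c in range(0, cols - cols % block, block):
            values = [image[r + dr][c + dc]
                      for dr in range(block) for dc in range(block)]
            if method == "mean":        # average of the 9 pixels
                out_row.append(mean(values))
            elif method == "median":    # median of the block
                out_row.append(median(values))
            elif method == "center":    # pixel at the block center
                out_row.append(image[r + block // 2][c + block // 2])
            else:
                raise ValueError(f"unknown method: {method}")
        upper.append(out_row)
    return upper
```

With the default arguments, a 3 × 3 block is replaced by its average, so the upper image has 1/3 of the original width and height.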
[0024]
The upper layer image memory 2 stores the upper image input from the image reduction circuit 1. Further, the upper layer image memory 2 is configured to update the stored pixel value of the upper image using the pixel value input from the pixel value update circuit 5. Further, the upper layer image memory 2 outputs the stored upper image data to the frame memory 9 via the switch 8.
[0025]
The prediction tap acquisition circuit 3 sequentially determines each pixel of the upper image stored in the upper layer image memory 2 as the target pixel, and supplies the pixel values of the target pixel and its neighboring pixels to the prediction coefficient calculation circuit 4, the pixel value update circuit 5, and the mapping circuit 6.
[0026]
For simplicity, a case will be described in which the size of the updated pixel value block (prediction tap) is 3 × 3 and the size of the prediction coefficient tap is 3 × 3, as shown in FIG. The updated pixel value block (prediction tap) means a block composed of a plurality of pixels extracted for prediction. The prediction coefficient tap means a plurality of coefficient groups used for prediction. In FIG. 5, x indicates a pixel to be updated, and X indicates a pixel whose pixel value is fixed. Further, c, which will be described later, represents a coefficient, Y ′ represents a predicted value, and Y represents a pixel value of the original image.
[0027]
For example, when the pixel x5 shown in FIG. 5 is determined as the target pixel, the prediction coefficient calculation circuit 4 is supplied with a prediction tap composed of the 3 × 3 pixels (pixels x1 to x9) centered on the target pixel x5. The pixel value update circuit 5 is supplied with all the pixels constituting the prediction taps that include any of the 3 × 3 pixels centered on the target pixel x5 (the 7 × 7 pixels centered on the target pixel x5, surrounded by the broken line in FIG. 5). The mapping circuit 6 is supplied with the 40 (= 49 − 9) pixels (pixels X1 to X40) obtained by removing the 3 × 3 pixels centered on the target pixel x5 from the 7 × 7 pixels centered on the target pixel x5.
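The three pixel groups described above can be sketched as follows. The function name `taps_for_target` and the raster ordering of the returned lists are assumptions made for illustration, and the sketch assumes the target pixel lies at least 3 pixels away from the image border.

```python
def taps_for_target(upper, r, c):
    """Return (update_block, window, fixed) around target pixel (r, c).

    update_block: the 3x3 pixels centered on the target (x1..x9)
    window:       the 7x7 pixels centered on the target (49 values)
    fixed:        the 40 pixels of the window outside the 3x3 block (X1..X40)
    """
    update_block = [upper[r + dr][c + dc]
                    for dr in (-1, 0, 1) for dc in (-1, 0, 1)]
    window = [upper[r + dr][c + dc]
              for dr in range(-3, 4) for dc in range(-3, 4)]
    fixed = [upper[r + dr][c + dc]
             for dr in range(-3, 4) for dc in range(-3, 4)
             if max(abs(dr), abs(dc)) > 1]   # outside the central 3x3
    return update_block, window, fixed
```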
[0028]
The prediction coefficient calculation circuit 4 generates an observation equation using the prediction tap (pixels x1 to x9) centered on the target pixel x5 supplied from the prediction tap acquisition circuit 3 as learning data (student data) and the corresponding pixels of the original image as teacher data. By solving the observation equation by the method of least squares, it calculates prediction coefficients for nine modes (mode 1 to mode 9) as shown in FIG. and supplies them to the pixel value update circuit 5 and the mapping circuit 6.
[0029]
Note that the prediction coefficient tap of each mode, composed of 3 × 3 prediction coefficients, is used when predicting the pixel values of the 3 × 3 pixels centered on the pixel of the lower layer image (hereinafter referred to as a lower image) at the position corresponding to the target pixel.
[0030]
More specifically, in the pixel arrangement of FIG. 1, the prediction coefficient tap used when predicting the pixel a is the prediction coefficient tap of mode 1, that for the pixel b is mode 2, that for the pixel c is mode 3, that for the pixel h is mode 4, that for the pixel i is mode 5, that for the pixel d is mode 6, that for the pixel g is mode 7, that for the pixel f is mode 8, and that for the pixel e is mode 9.
[0031]
FIG. 7A shows a lower image predicted from the upper image shown in FIG. For example, the pixel value of the pixel adjacent to the upper left (pixel Y1' in FIG. 7B) of the lower-image pixel at the position corresponding to the target pixel (pixel Y5' in FIG. 7B) is calculated by a linear combination of the mode 1 prediction coefficient tap (prediction coefficients c11 to c19) and the pixel values of the pixels constituting the prediction tap centered on the pixel x3 (pixels X11, X12, X13, x2, x3, X17, x5, x6, X21). Likewise, the pixel value of the pixel adjacent to the lower right (pixel Y9' in FIG. 7B) is calculated by a linear combination of the mode 9 prediction coefficient tap (prediction coefficients c91 to c99) and the pixel values of the pixels constituting the prediction tap centered on the pixel x3.
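The per-mode linear combination can be sketched in a few lines. The function name `map_tap` and the raster ordering of both the tap and the modes (mode 1 giving Y1' at the upper left, mode 9 giving Y9' at the lower right) are assumptions for illustration.

```python
def map_tap(tap, coeffs):
    """Predict the 9 lower-image pixels Y1'..Y9' from one 3x3 prediction tap.

    tap:    9 upper-image pixel values in raster order
    coeffs: 9 modes, each a list of 9 prediction coefficients (c_m1..c_m9)
    """
    return [sum(c * x for c, x in zip(mode, tap)) for mode in coeffs]
```

For example, if every mode has a single coefficient of 1 at the tap center, all nine predicted pixels equal the center pixel of the tap.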
[0032]
The pixel value update circuit 5 is configured to simultaneously update pixel values of 3 × 3 pixels centered on the target pixel and output the updated pixel values to the upper layer image memory 2 and the mapping circuit 6.
[0033]
FIG. 8 shows a detailed configuration example of the pixel value update circuit 5. The normal equation generation circuit 11 generates a normal equation from the predicted values and the true values (pixel values of the original image), using the prediction coefficients input from the prediction coefficient calculation circuit 4, the pixel values of the pixels constituting the prediction tap input from the prediction tap acquisition circuit 3, and the corresponding pixel values of the original image, and outputs it to the pixel value determination circuit 12. The pixel value determination circuit 12 uses the input normal equation to simultaneously calculate the pixel values (update values) of the 3 × 3 pixels centered on the target pixel of the upper image that minimize the error between the predicted values and the true values. In the following, the nine pixels that are updated at the same time are referred to as an updated pixel value tap.
[0034]
Here, the generated normal equation will be described. The normal equation is generated using pixel values in the range where the updated pixel value tap and the prediction coefficient tap at least partially overlap. For example, when the pixel values of the 3 × 3 pixels (pixels x1 to x9) centered on the pixel x5 shown in FIG. 5 are updated (updated pixel value tap), the pixel values of the pixels other than the updated pixel value tap in the area surrounded by the broken line (pixels X1 to X40) and all the prediction coefficients c11 to c99 are fixed, and the prediction coefficient tap is moved within the area surrounded by the broken line to predict the pixel values of the lower image.
[0035]
For example, when the center of the prediction coefficient tap is moved to the position overlapping the pixel x3, the pixel value (predicted value) of the pixel Y1' at the upper left of the pixel Y5' of the lower image corresponding to the position of the pixel x3 is calculated by a linear combination of the prediction coefficient tap of mode 1 (prediction coefficients c11 to c19) and the 3 × 3 pixels (pixels X11, X12, X13, x2, x3, X17, x5, x6, X21). This pixel value Y1' can be expressed by the following equation (1).
[0036]
Y1 '= c11X11 + c12X12 + c13X13 + c14x2 + c15x3 + c16X17 + c17x5 + c18x6 + c19X21 (1)
Similarly, the pixel values Y2 'to Y9' are also expressed by a linear linear combination of the prediction coefficient and the pixel value of the upper image, and if the nine equations obtained are rewritten using a matrix, the following observations are made: The equation holds.
[0037]
Y '= cX
Here, Y ′ is a matrix composed of a set of pixel values Y1 ′ to Y9 ′, c is a matrix composed of a set of prediction coefficients c11 to c99, and X is a matrix composed of a set of pixel values of the upper image.
[0038]
Next, it is considered to obtain a predicted value Y ′ close to the pixel value of the original image by applying the least square method to this observation equation.
[0039]
Here, paying attention to the equation (1) that is the basis of the observation equation again, the difference between the predicted value Y1 ′ and the corresponding pixel value Y1 of the original image is as shown in the following equation (2).
[0040]
Y1-Y1 '= Y1-(c11X11 + c12X12 + c13X13 + c14x2 + c15x3 + c16X17 + c17x5 + c18x6 + c19X21) (2)
If the right side of Expression (2) is rearranged, the following Expression (3) is obtained.
[0041]
Y1-Y1 '= Y1- (c11X11 + c12X12 + c13X13 + c16X17 + c19X21)-(c14x2 + c15x3 + c17x5 + c18x6) (3)
If the difference between the predicted value Y1' and the corresponding pixel value Y1 of the original image, that is, the left side of equation (3), is regarded as the residual, and the constant terms on the right side are moved to the left side and rearranged, the following equation (4) is obtained.
[0042]
Y1-(c11X11 + c12X12 + c13X13 + c16X17 + c19X21) + e1 = (c14x2 + c15x3 + c17x5 + c18x6) (4)
Further, using the other modes (mode 2 to mode 9) of the prediction coefficient tap, the following equations (5) to (12), which are similar to equation (4), are obtained from Yn − Yn' (n is 2 to 9).
[0043]
Y2-(c21X11 + c22X12 + c23X13 + c26X17 + c29X21) + e2 = (c24x2 + c25x3 + c27x5 + c28x6) (5)
Y3-(c31X11 + c32X12 + c33X13 + c36X17 + c39X21) + e3 = (c34x2 + c35x3 + c37x5 + c38x6) (6)
Y4-(c41X11 + c42X12 + c43X13 + c46X17 + c49X21) + e4 = (c44x2 + c45x3 + c47x5 + c48x6) (7)
Y5-(c51X11 + c52X12 + c53X13 + c56X17 + c59X21) + e5 = (c54x2 + c55x3 + c57x5 + c58x6) (8)
Y6-(c61X11 + c62X12 + c63X13 + c66X17 + c69X21) + e6 = (c64x2 + c65x3 + c67x5 + c68x6) (9)
Y7-(c71X11 + c72X12 + c73X13 + c76X17 + c79X21) + e7 = (c74x2 + c75x3 + c77x5 + c78x6) (10)
Y8- (c81X11 + c82X12 + c83X13 + c86X17 + c89X21) + e8 = (c84x2 + c85x3 + c87x5 + c88x6) (11)
Y9-(c91X11 + c92X12 + c93X13 + c96X17 + c99X21) + e9 = (c94x2 + c95x3 + c97x5 + c98x6) (12)
Similarly, the position of the prediction coefficient tap is moved within the area surrounded by the broken line in FIG. 5, that is, the center of the prediction coefficient tap is moved sequentially so as to overlap each of the 25 pixels within the rectangular area having the pixels X9, X13, X28, and X32 as vertices, and 225 (= 9 × 25) equations similar to equations (4) to (12) are obtained using all modes of the prediction coefficient tap.
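The construction of these 225 equations can be sketched as follows. This is an illustrative sketch under stated assumptions, not the circuit's implementation: the function name `build_update_system`, the 21 × 21 `original` window aligned with the 7 × 7 upper window (each upper pixel sitting at the center of a 3 × 3 block of original pixels), and the raster ordering of taps and modes are all introduced here. Each row holds the coefficients multiplying the unknowns x1 to x9, and the right-hand value is the original pixel value minus the contribution of the fixed pixels, as in equations (4) to (12).

```python
def build_update_system(window, original, coeffs):
    """Build the 225 x 9 system for one updated pixel value tap.

    window:   7x7 upper-image values; the central 3x3 are the unknowns
    original: 21x21 original-image values covering the same area
    coeffs:   9 modes x 9 prediction coefficients
    Returns (A, b) with one row per (tap position, mode).
    """
    A, b = [], []
    for pr in range(1, 6):                # tap center row in the window
        for pc in range(1, 6):            # tap center column
            for m in range(9):            # mode 1..9 stored as index 0..8
                row = [0.0] * 9
                const = 0.0
                for k in range(9):
                    r, c = pr + k // 3 - 1, pc + k % 3 - 1
                    if 2 <= r <= 4 and 2 <= c <= 4:    # pixel being updated
                        row[(r - 2) * 3 + (c - 2)] += coeffs[m][k]
                    else:                              # fixed pixel
                        const += coeffs[m][k] * window[r][c]
                y = original[3 * pr + m // 3][3 * pc + m % 3]
                A.append(row)
                b.append(y - const)
    return A, b
```

With the tap centered on x3 and mode 1, the only nonzero entries of the row are those for x2, x3, x5, and x6, which matches the right side of equation (4).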
[0044]
If these 225 equations are represented in matrix form, they become a residual equation of the form [teacher data] + [residual e] = [learning data c] × [predicted pixel value x], as shown in the following equation (13).
[0045]
[Equation 1]

[0046]
However, in order to simplify the notation, Expression (13) shows only the part corresponding to Expressions (4) to (12). Here, aij (i = 1, 2, ..., m (= 225), j = 1, 2, ..., 9) is equal to the data in row i and column j of the matrix [learning data c].
[0047]
In this case, the predicted pixel value xi for determining the predicted value Y ′ close to the pixel value Y of the original image can be determined by minimizing the following square error.
[0048]
[Equation 2]

[0049]
Accordingly, the predicted pixel values xi for which the derivative of the above squared error with respect to xi is 0, that is, the predicted pixel values xi satisfying the following equation, are the optimal values for obtaining a predicted value Y ′ close to the pixel value Y of the original image.
[0050]
[Equation 3]

[0051]
Therefore, first, the following equation is established by differentiating the equation (13) by the predicted pixel value xi.
[0052]
[Equation 4]

[0053]
Expression (17) is obtained from Expression (14) and Expression (16).
[0054]
[Equation 5]

[0055]
Furthermore, letting Y ″ denote the teacher data of equation (13) (the pixel value Y of the original image minus the constant terms), and considering the relationship among the teacher data Y ″, the prediction coefficients c, the predicted pixel values x, and the residuals e, a normal equation such as the following equation (18) is obtained from equation (17).
[0056]
[Equation 6]
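The equation images referred to above are not reproduced in this text. The following is a hedged reconstruction of the standard least-squares derivation they describe, written from the surrounding definitions (aij, xj, ei, the teacher data Y″, and m = 225). Only the numbers (13), (17), and (18), which the text states explicitly, are attached; the numbering of the intermediate expressions and the exact layout of the original equations are not known and are left open.

```latex
% Residual equation: [teacher data] + [residual e] = [learning data c][x]
\begin{align}
\begin{bmatrix} Y''_1 \\ \vdots \\ Y''_m \end{bmatrix}
+
\begin{bmatrix} e_1 \\ \vdots \\ e_m \end{bmatrix}
&=
\begin{bmatrix}
a_{11} & \cdots & a_{19} \\
\vdots &        & \vdots \\
a_{m1} & \cdots & a_{m9}
\end{bmatrix}
\begin{bmatrix} x_1 \\ \vdots \\ x_9 \end{bmatrix}
\tag{13}
\end{align}

% Squared error to be minimized
\[ E = \sum_{i=1}^{m} e_i^{2} \]

% Condition that the derivative with respect to each x_j vanishes
\[ \sum_{i=1}^{m} e_i \frac{\partial e_i}{\partial x_j} = 0
   \qquad (j = 1, \dots, 9) \]

% Differentiating the residual equation with respect to x_j
\[ \frac{\partial e_i}{\partial x_j} = a_{ij}
   \qquad (i = 1, \dots, m;\ j = 1, \dots, 9) \]

% Combining the two previous expressions
\begin{align}
\sum_{i=1}^{m} e_i\, a_{ij} &= 0 \qquad (j = 1, \dots, 9) \tag{17}
\end{align}

% Normal equation, substituting e_i = \sum_k a_{ik} x_k - Y''_i
\begin{align}
\sum_{k=1}^{9} \Bigl( \sum_{i=1}^{m} a_{ij}\, a_{ik} \Bigr) x_k
&= \sum_{i=1}^{m} a_{ij}\, Y''_i \qquad (j = 1, \dots, 9) \tag{18}
\end{align}
```

In matrix notation, with A = (aij), the last expression reads AᵀA x = Aᵀ Y″, a system of nine linear equations in the nine unknowns x1 to x9.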

[0057]
By solving the obtained normal equation, for example by applying the sweep-out method (Gauss-Jordan elimination), the optimal pixel values of the updated pixel value tap corresponding to the prediction coefficient taps supplied from the prediction coefficient calculation circuit 4 can be obtained.
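A minimal sketch of this solving step is shown below. The function name `solve_normal_equations` and the use of partial pivoting with a singularity tolerance are assumptions added for numerical robustness; the text itself only names Gauss-Jordan elimination as one possible method.

```python
def solve_normal_equations(A, b):
    """Least-squares solution of A x ~= b.

    Forms the normal equations (A^T A) x = A^T b and solves them by
    Gauss-Jordan elimination with partial pivoting.
    """
    n = len(A[0])
    # Augmented matrix [A^T A | A^T b]
    M = [[sum(row[i] * row[j] for row in A) for j in range(n)]
         + [sum(row[i] * y for row, y in zip(A, b))]
         for i in range(n)]
    for col in range(n):
        # Choose the row with the largest pivot candidate
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("normal equations are singular")
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]          # normalize pivot row
        for r in range(n):
            if r != col:                          # clear the column elsewhere
                f = M[r][col]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]
```

For the block update, A would have 225 rows and 9 columns, and the returned list would hold the nine updated pixel values x1 to x9.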
[0058]
Returning to the description of FIG. The mapping circuit 6 calculates linear combinations of the pixel values of the nine pixels of the updated pixel value tap centered on the target pixel supplied from the pixel value update circuit 5, the pixel values of the 40 pixels obtained by excluding the 3 × 3 pixels centered on the target pixel from the 7 × 7 pixels centered on the target pixel supplied from the prediction tap acquisition circuit 3, and the prediction coefficients of the prediction coefficient taps for nine modes input from the prediction coefficient calculation circuit 4. As a result, part of the lower image (the range affected by the pixels of the updated pixel value tap) is locally decoded. The pixel values of the locally decoded lower image are supplied to the error calculation circuit 7.
[0059]
The error calculation circuit 7 calculates the error between the pixel values of the locally decoded lower image from the mapping circuit 6 and the corresponding pixel values of the original image. In the following description, S/N is used as the error measure, where S/N = 20 log10(255 / err) (err: standard deviation of the error). When S/N is equal to or greater than the threshold value, it is determined that optimal pixels have been generated, and the switch 8 is turned on. The S/N may also be evaluated over the entire image instead of over the partially locally decoded image.
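The S/N measure can be sketched as follows. The function name `snr_db` is hypothetical, and the sketch treats err as the root-mean-square error, which equals the standard deviation of the error when the mean error is zero; this is an assumption, since the text only says "standard deviation of the error".

```python
import math

def snr_db(predicted, original, peak=255.0):
    """S/N = 20 * log10(peak / err), with err taken as the RMS error.

    predicted, original: equally long flat sequences of pixel values.
    Returns math.inf when the two sequences are identical.
    """
    squared = [(p - o) ** 2 for p, o in zip(predicted, original)]
    err = math.sqrt(sum(squared) / len(squared))
    if err == 0:
        return math.inf
    return 20.0 * math.log10(peak / err)
```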
[0060]
The frame memory 9 is configured to update and store the partially optimized upper image each time it is input from the upper layer image memory 2 via the switch 8. Therefore, after all the pixels of the upper image stored in the upper layer image memory 2 have been set as the target pixel, the optimum upper image in which all the pixels are optimized is stored in the frame memory 9.
[0061]
The optimum upper image stored in the frame memory 9 is output to a decoder (described later with reference to FIG. 13) at a predetermined timing together with prediction coefficient taps for nine modes. A control unit 10 is provided to control processing of the encoder described below. The control unit 10 receives the output of the error calculation circuit 7 and generates a signal for controlling the switch 8. Also, various control signals are supplied to each block in order to perform the encoder processing.
[0062]
Next, an outline of the optimum upper pixel value generation process of the encoder will be described with reference to the flowchart of FIG. The processing described below is described corresponding to the configuration of FIG. However, the present invention is not limited to the hardware having the configuration shown in FIG. 4, and may be executed by the CPU 114 according to a software program installed from the outside or stored in the ROM 112 in FIG. 3. In that case, the processing of each step is performed under the control of the CPU 114 in accordance with the software program.
[0063]
In step S1, the image reduction circuit 1 divides the supplied original image (high resolution image) into blocks each composed of 3 × 3 pixels, generates an initial upper image (low-resolution image) in which the average of the pixel values of the 9 pixels in each block is the pixel value of the upper-image pixel positioned at the center of the block, and stores it in the upper layer image memory 2.
[0064]
The prediction tap acquisition circuit 3 sequentially determines each pixel of the upper image stored in the upper layer image memory 2 as the target pixel, and acquires the pixel values of the 7 × 7 pixels centered on the target pixel from the upper layer image memory 2. Of the acquired 49 pixel values, the pixel values of the 3 × 3 pixels centered on the target pixel are supplied to the prediction coefficient calculation circuit 4. All the acquired pixel values are supplied to the pixel value update circuit 5. Further, the pixel values of the 40 (= 49 − 9) pixels excluding the 3 × 3 pixels centered on the target pixel are supplied to the mapping circuit 6. For example, when the pixel x5 shown in FIG. 5 is determined as the target pixel, the pixel values of the prediction tap of 3 × 3 pixels (pixels x1 to x9) centered on the target pixel x5 are supplied to the prediction coefficient calculation circuit 4, the pixel values of the 7 × 7 pixels centered on the target pixel x5 are supplied to the pixel value update circuit 5, and the pixel values of the 40 (= 49 − 9) pixels obtained by excluding the 3 × 3 pixels centered on the target pixel x5 from the 7 × 7 pixels centered on the target pixel x5 are supplied to the mapping circuit 6.
[0065]
In step S2, the prediction coefficient calculation circuit 4 generates an observation equation using the prediction tap of 3 × 3 pixels centered on the target pixel supplied from the prediction tap acquisition circuit 3 as learning data (student data) and the corresponding pixels of the original image as teacher data, obtains the prediction coefficient taps for nine modes by solving it with the least squares method, and supplies them to the pixel value update circuit 5 and the mapping circuit 6. When obtaining the prediction coefficients, the equations are constructed for all the pixels in the screen.
[0066]
In step S3, the normal equation generation circuit 11 of the pixel value update circuit 5 generates an observation equation as shown in Expression (13), using the prediction coefficient taps input from the prediction coefficient calculation circuit 4, the pixel values of the 7 × 7 pixels centered on the target pixel supplied from the prediction tap acquisition circuit 3, and the corresponding pixel values of the original image, and outputs it to the pixel value determination circuit 12. The pixel value determination circuit 12 solves the input observation equation by applying the least squares method, and outputs the obtained pixel values of the updated pixel value tap to the upper layer image memory 2 and the mapping circuit 6.
[0067]
The upper layer image memory 2 uses the pixel values of the updated pixel value tap input from the pixel value update circuit 5 to update the pixel values of the corresponding pixels of the upper image stored so far. The mapping circuit 6 calculates linear combinations of the pixel values of the updated pixel value tap input from the pixel value update circuit 5, the pixel values of the 40 pixels obtained by excluding the 3 × 3 pixels centered on the target pixel from the 7 × 7 pixels centered on the target pixel input from the prediction tap acquisition circuit 3, and the prediction coefficient taps for nine modes input from the prediction coefficient calculation circuit 4, thereby locally decoding part of the pixel values of the lower image. The pixel values of the locally decoded lower image are supplied to the error calculation circuit 7.
[0068]
In step S4, the error calculation circuit 7 calculates the S/N between the pixel values of the locally decoded lower image from the mapping circuit 6 and the corresponding pixel values of the original image, and determines whether or not the S/N is equal to or greater than a predetermined threshold value. If it is determined that the S/N is not equal to or greater than the predetermined threshold value, the processes in steps S2 to S4 are repeated. If it is determined that the S/N is equal to or greater than the predetermined threshold value, the process proceeds to step S5.
[0069]
In step S 5, the switch 8 is turned on by the control of the error calculation circuit 7, and a partially optimized upper image is output from the upper layer image memory 2 to the frame memory 9 via the switch 8.
[0070]
By executing this optimum upper pixel value generation process for all the pixels of the upper image stored in the upper layer image memory 2, the optimum upper image in which all the pixels are optimized is stored in the frame memory 9. The stored optimum upper image is output to the decoder at a predetermined timing together with the prediction coefficient taps for nine modes.
[0071]
Several examples of the processing of the encoder in FIG. 9 will be described. The first method shown in the flowchart of FIG. 10 is an example in which each pixel is updated once for each update of the prediction coefficient.
[0072]
In step S21, the encoder generates an upper image by reducing the original image. Then, the encoder updates the prediction coefficients for all the pixels of the entire screen (step S22). In the next step S23, the encoder updates the pixel values of a block (synonymous with the updated pixel value tap). In step S24, it is determined whether or not the processing of all the blocks has been completed. If not, the process returns to step S23 and the processing is repeated.
[0073]
When it is determined in step S24 that the pixel values of all blocks have been updated, the encoder maps (locally decodes) the updated upper image and calculates an S/N indicating the error from the lower image (step S25). In step S26, the encoder determines whether the S/N is equal to or greater than a threshold value. If the S/N is equal to or greater than the threshold value, the updated upper image is output to the frame memory 9 and the prediction coefficients are output (step S27). If the S/N is smaller than the threshold value in step S26, the process returns to step S22, and the processes from step S22 onward are repeated.
[0074]
FIG. 11 is a flowchart showing the second method. In the second method, the pixel values of only one block are updated for each update of the prediction coefficients. Accordingly, the flowchart differs from that of the first method only in that, when the update of the pixel values of all the blocks has not been completed, the process returns to step S22 (update of the prediction coefficients for the entire screen) instead of step S23 (update of the pixel values).
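The control flow of the first and second methods can be sketched with a single function. This is only a skeleton under stated assumptions: `encode`, the callback parameters `update_coeffs`, `update_block`, and `snr`, the flag `per_block_coeff_update`, and the safety limit `max_rounds` are all hypothetical names introduced here, and the actual coefficient and pixel updates are left to the caller.

```python
def encode(upper, blocks, update_coeffs, update_block, snr, threshold,
           per_block_coeff_update=False, max_rounds=100):
    """Iterate coefficient and block updates until S/N reaches the threshold.

    per_block_coeff_update=False: first method (coefficients once per pass)
    per_block_coeff_update=True:  second method (coefficients before each block)
    max_rounds is an added safeguard, not part of the described flowcharts.
    """
    coeffs = None
    for _ in range(max_rounds):
        if not per_block_coeff_update:
            coeffs = update_coeffs(upper)            # step S22, once per pass
        for blk in blocks:
            if per_block_coeff_update:
                coeffs = update_coeffs(upper)        # step S22, per block
            update_block(upper, blk, coeffs)         # step S23
        if snr(upper, coeffs) >= threshold:          # steps S25, S26
            break                                    # step S27: output
    return upper, coeffs
```

The second method spends more time on coefficient updates but always updates a block against coefficients that reflect the most recent pixel values.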
[0075]
FIG. 12 is a flowchart showing the third method. In the third method, the S/N is evaluated both after updating the prediction coefficients and after updating the pixel values.
[0076]
In FIG. 12, after the original image reduction process (step S31), the prediction coefficients of all the pixels are updated (step S32). The encoder maps the updated upper image and calculates an S / N that is an error with the lower image (step S33). In step S34, it is determined whether S / N is equal to or greater than a threshold value. If S / N is equal to or greater than the threshold value, the encoder outputs the updated upper image to the frame memory 9 and outputs a prediction coefficient (step S35).
[0077]
If S / N is smaller than the threshold value in step S34, the process proceeds to step S36, and in step S36, the encoder updates the pixel value of the block. In step S37, it is determined whether or not the processing of all blocks has been completed. If not, the process returns to step S36 and the processing is repeated.
[0078]
If it is determined in step S37 that the pixel values of all blocks have been updated, the encoder maps the updated upper image and calculates the S/N that is the error from the lower image (step S38). In step S39, it is determined whether the S/N is equal to or greater than a threshold value. If the S/N is equal to or greater than the threshold value, the encoder outputs the updated upper image to the frame memory 9 and outputs the prediction coefficients (step S35). If the S/N is smaller than the threshold value in step S39, the process returns to step S32, and the processes from step S32 onward are repeated.
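The third method can be sketched in the same style. As before, `encode_third_method`, the callbacks, and `max_rounds` are hypothetical names used only for illustration; the step numbers in the comments refer to the flowchart described above.

```python
def encode_third_method(upper, blocks, update_coeffs, update_block, snr,
                        threshold, max_rounds=100):
    """Evaluate S/N after the coefficient update and after the pixel update.

    max_rounds is an added safeguard, not part of the described flowchart.
    """
    coeffs = None
    for _ in range(max_rounds):
        coeffs = update_coeffs(upper)               # step S32
        if snr(upper, coeffs) >= threshold:         # steps S33, S34
            break                                   # step S35: output
        for blk in blocks:                          # steps S36, S37
            update_block(upper, blk, coeffs)
        if snr(upper, coeffs) >= threshold:         # steps S38, S39
            break                                   # step S35: output
    return upper, coeffs
```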
[0079]
In the above-described embodiment of the present invention, both the prediction coefficients and the pixel values of the upper image are optimized. However, in the present invention, it is also possible to optimize only the pixel values by obtaining the prediction coefficients in advance. In this case, the prediction coefficients are generated in advance by performing the same process as the prediction coefficient generation process in the encoder, using digital images for coefficient determination. Further, since the prediction coefficients are then shared by the encoder and the decoder, it is not necessary to record them on the recording medium or to transmit them.
[0080]
Next, a configuration example of a decoder that restores an original image (predicts a lower image) from an optimal upper image output from an encoder will be described with reference to FIG. In this decoder, the optimum upper image input from the encoder is stored in the optimum upper layer image memory 21, and prediction coefficient taps for nine modes are supplied to the mapping circuit 23.
[0081]
The prediction tap acquisition circuit 22 sequentially determines each pixel of the optimal upper image stored in the optimal upper layer image memory 21 as the target pixel, acquires a prediction tap of 3 × 3 pixels centered on the target pixel from the optimal upper layer image memory 21, and outputs it to the mapping circuit 23.
[0082]
The mapping circuit 23 calculates linear combinations of the pixel values of the nine pixels forming the prediction tap input from the prediction tap acquisition circuit 22 and the prediction coefficient taps for nine modes supplied from the encoder, thereby predicting the pixel values of the 3 × 3 pixels centered on the pixel of the lower image corresponding to the position of the target pixel (restoring the pixels of the original image). The predicted pixel values of the 3 × 3 pixels of the lower image are output to the frame memory 24 and stored therein. The pixel values of the lower image stored in the frame memory 24 are output to a display (not shown) or the like at a predetermined timing for each frame.
[0083]
The original image restoration processing of this decoder will be described with reference to the flowchart in FIG. The original image restoration process is started after the optimum upper image generated by the encoder is stored in the optimum upper layer image memory 21 and the prediction coefficient taps for nine modes are supplied to the mapping circuit 23.
[0084]
In step S 11, the prediction tap acquisition circuit 22 determines one pixel among the pixels of the optimum upper image stored in the optimum upper layer image memory 21 as the target pixel. In step S 12, the prediction tap acquisition circuit 22 acquires a 3 × 3 pixel prediction tap centered on the target pixel from the optimal upper layer image memory 21 and outputs the prediction tap to the mapping circuit 23.
[0085]
In step S13, the mapping circuit 23 calculates linear combinations of the pixel values of the nine pixels forming the prediction tap input from the prediction tap acquisition circuit 22 and the prediction coefficient taps for nine modes supplied from the encoder, thereby predicting the pixel values of the 3 × 3 pixels centered on the pixel of the lower image corresponding to the position of the target pixel (restoring the pixels of the original image). The predicted pixel values of the 3 × 3 pixels of the lower image are output to the frame memory 24 and stored.
[0086]
In step S14, the prediction tap acquisition circuit 22 determines whether or not all the pixels of the optimum upper image stored in the optimum upper layer image memory 21 have been determined as the target pixel, and the processes of steps S11 to S14 are repeated until it is determined that all the pixels have been determined as the target pixel. If it is determined that all the pixels have been determined as the target pixel, the process proceeds to step S15.
[0087]
In step S15, the pixel value of the lower image stored in the frame memory 24 is output to a display (not shown) or the like at a predetermined timing for each frame.
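The decoder loop of steps S11 to S15 can be sketched as follows. The function name `decode` and the handling of border pixels by replicating the nearest edge pixel are assumptions made here; the text does not say how taps at the image border are formed.

```python
def decode(upper, coeffs):
    """Restore a lower image 3x larger than `upper` in each direction.

    upper:  2-D list of optimal upper-image pixel values
    coeffs: 9 modes x 9 prediction coefficients (raster order)
    Border taps replicate the nearest edge pixel (an assumption).
    """
    rows, cols = len(upper), len(upper[0])
    lower = [[0.0] * (3 * cols) for _ in range(3 * rows)]
    for r in range(rows):                            # steps S11, S14
        for c in range(cols):
            tap = [upper[min(max(r + dr, 0), rows - 1)]
                        [min(max(c + dc, 0), cols - 1)]
                   for dr in (-1, 0, 1) for dc in (-1, 0, 1)]   # step S12
            for m, mode in enumerate(coeffs):                   # step S13
                lower[3 * r + m // 3][3 * c + m % 3] = sum(
                    k * x for k, x in zip(mode, tap))
    return lower
```

With coefficients that select only the tap center, this reduces to nearest-neighbor enlargement, which is a convenient sanity check.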
[0088]
According to the present embodiment, it is possible to obtain an upper image from which a restored image with a higher S/N can be obtained than with the conventional method.
[0089]
[Effect of the invention]
As described above, according to the present invention, the pixel values of a plurality of pixels can be simultaneously optimized in units of blocks. Thereby, processing can be simplified and processing time can be shortened.
[Brief description of the drawings]
FIG. 1 is a schematic diagram showing a pixel arrangement for explaining the previously proposed encoding.
FIG. 2 is a block diagram showing an overall configuration of an image data conversion apparatus to which the present invention is applied.
FIG. 3 is a block diagram illustrating a functional configuration example of the transmission device in FIG. 2.
FIG. 4 is a block diagram showing a configuration example of an encoder to which the present invention is applied.
FIG. 5 is a diagram for explaining processing of the prediction tap acquisition circuit 3 in FIG. 4.
FIG. 6 is a diagram for explaining a prediction coefficient tap.
FIG. 7 is a diagram illustrating a lower layer image.
FIG. 8 is a block diagram illustrating a configuration example of the pixel value update circuit 5 in FIG. 4.
FIG. 9 is a flowchart for explaining an outline of the optimum pixel value generation process of the encoder of FIG. 4.
FIG. 10 is a flowchart illustrating an example of the optimum pixel value generation process of the encoder of FIG.
FIG. 11 is a flowchart for explaining another example of the optimum pixel value generation process of the encoder of FIG. 4.
FIG. 12 is a flowchart for explaining still another example of the optimum pixel value generation process of the encoder of FIG.
FIG. 13 is a block diagram illustrating a configuration example of a decoder that restores an original image from an optimal upper image generated by the encoder of FIG. 4.
FIG. 14 is a flowchart illustrating the original image restoration processing of the decoder in FIG.
[Explanation of symbols]
1 ... Image reduction circuit, 2 ... Upper layer image memory, 3 ... Prediction tap acquisition circuit, 4 ... Prediction coefficient calculation circuit, 5 ... Pixel value update circuit, 6 ... Mapping circuit, 7 ... Error calculation circuit, 11 ... Normal equation generation circuit, 12.

Claims

An image data conversion apparatus for converting first image data into second image data of lower quality than the first image data, the apparatus comprising:
an intermediate image data generation unit that generates, from the first image data, intermediate image data of substantially the same quality as the second image data;
a storage unit that stores the intermediate image data;
a block extraction unit that extracts, from the intermediate image data, a plurality of pixels for each block, each block being a part of one screen;
a prediction coefficient generation unit that outputs a prediction coefficient that is generated or acquired in advance;
a pixel value update unit that collectively updates the pixel values of the plurality of pixels of the intermediate image data extracted by the block extraction unit, based on the prediction coefficient, the intermediate image data, and the first image data;
a predicted image data generation unit that generates predicted image data of substantially the same quality as the first image data, based on the prediction coefficient and the intermediate image data whose pixel values have been updated by the pixel value update unit;
an error detection unit that detects an error between the first image data and the predicted image data; and
a control unit that determines, based on the error, whether or not to use the intermediate image data as an output image.
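The cooperation of the units recited above can be sketched as a simple control loop. This is only an illustrative sketch, not the claimed implementation: the function names (`convert`, `reduce_fn`, `coeff_fn`, `update_fn`, `predict_fn`), the use of a sum of squared differences as the error, and the stopping rule (an error threshold plus an iteration cap) are all assumptions introduced for the example.

```python
from typing import Any, Callable, List, Tuple

Image = List[List[float]]


def squared_error(original: Image, predicted: Image) -> float:
    """Sum of squared pixel differences over one screen (assumed error measure)."""
    return sum(
        (o - p) ** 2
        for row_o, row_p in zip(original, predicted)
        for o, p in zip(row_o, row_p)
    )


def convert(
    original: Image,
    reduce_fn: Callable[[Image], Image],
    coeff_fn: Callable[[Image, Image], Any],
    update_fn: Callable[[Image, Any, Image], Image],
    predict_fn: Callable[[Image, Any], Image],
    threshold: float,
    max_iters: int = 10,
) -> Tuple[Image, Any, float]:
    """Hypothetical driver: iterate until the prediction error is small enough."""
    intermediate = reduce_fn(original)  # intermediate image data generation
    coeffs: Any = None
    error = float("inf")
    for _ in range(max_iters):
        coeffs = coeff_fn(intermediate, original)  # prediction coefficients
        # Collective (block-wise) update of the intermediate pixel values.
        intermediate = update_fn(intermediate, coeffs, original)
        predicted = predict_fn(intermediate, coeffs)  # predicted image data
        error = squared_error(original, predicted)  # error detection
        if error <= threshold:  # control decision (assumed rule)
            break
    return intermediate, coeffs, error
```

Passing the individual stages in as callables keeps the sketch independent of any particular reduction, coefficient, or update method.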

The image data conversion apparatus according to claim 1, wherein the prediction coefficient generation unit generates the prediction coefficient based on the intermediate image data extracted by the block extraction unit and the first image data at a position corresponding to the extracted intermediate image data.

The image data conversion apparatus according to claim 1, wherein the error detection unit detects an error between the first image data for one screen and the predicted pixel data for one screen.

The image data conversion apparatus according to claim 1, wherein the pixel value update unit comprises:
a normal equation generation unit that generates a normal equation based on the prediction coefficient, the intermediate image data, and the first image data; and
a pixel value determination unit that determines the updated pixel values of the intermediate image data by solving the normal equation.

The image data conversion apparatus according to claim 3, wherein the pixel value determination unit solves the normal equation using a least squares method.
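As a minimal sketch of how a normal equation might be generated and solved for one block, assume each original pixel y[k] is predicted as a weighted sum of the block's unknown pixel values x[j], with the fixed prediction coefficients as the weights W[k][j]. The function names, the matrix layout, and the assumption that contributions from pixels outside the block have already been subtracted from y are all introduced here for illustration and are not taken from the specification.

```python
from typing import List, Tuple

Matrix = List[List[float]]
Vector = List[float]


def build_normal_equation(W: Matrix, y: Vector) -> Tuple[Matrix, Vector]:
    """Return (A, b) with A = W^T W and b = W^T y."""
    n = len(W[0])
    A = [[sum(row[i] * row[j] for row in W) for j in range(n)] for i in range(n)]
    b = [sum(row[i] * t for row, t in zip(W, y)) for i in range(n)]
    return A, b


def solve_linear_system(A: Matrix, b: Vector) -> Vector:
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("normal equation is singular; more samples needed")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def update_block_pixels(W: Matrix, y: Vector) -> Vector:
    """Least squares estimate of all block pixel values at once."""
    A, b = build_normal_equation(W, y)
    return solve_linear_system(A, b)


# Toy usage: three original pixels predicted from two block pixels.
print(update_block_pixels([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [2.0, 3.0, 5.0]))
```

For the toy data the solution is approximately [2.0, 3.0]. Solving for all pixels of the block in one system is what distinguishes this from updating one pixel at a time with its neighbours held fixed.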

An image data conversion method for converting first image data into second image data of lower quality than the first image data, the method comprising the steps of:
generating, from the first image data, intermediate image data of substantially the same quality as the second image data;
extracting, from the intermediate image data, a plurality of pixels for each block, each block being a part of one screen;
outputting a prediction coefficient that is generated or acquired in advance; collectively updating the pixel values of the plurality of pixels of the extracted intermediate image data, based on the prediction coefficient, the intermediate image data, and the first image data;
generating predicted image data of substantially the same quality as the first image data, based on the prediction coefficient and the intermediate image data whose pixel values have been updated;
detecting an error between the first image data and the predicted image data; and
determining, based on the error, whether or not to use the intermediate image data as an output image.

A learning apparatus that learns the pixel values of second image data when converting first image data into the second image data of lower quality than the first image data, the apparatus comprising:
an intermediate image data generation unit that generates, from the first image data, intermediate image data of substantially the same quality as the second image data;
a storage unit that stores the intermediate image data;
a block extraction unit that extracts, from the intermediate image data, a plurality of pixels for each block, each block being a part of one screen;
a prediction coefficient generation unit that outputs a prediction coefficient that is generated or acquired in advance;
a pixel value update unit that collectively updates the pixel values of the plurality of pixels of the intermediate image data extracted by the block extraction unit, based on the prediction coefficient, the intermediate image data, and the first image data;
a predicted image data generation unit that generates predicted image data of substantially the same quality as the first image data, based on the prediction coefficient and the intermediate image data whose pixel values have been updated by the pixel value update unit;
an error detection unit that detects an error between the first image data and the predicted image data; and
a control unit that determines, based on the error, whether or not to use the intermediate image data as an output image,
wherein the pixel value update unit updates the pixel values of the intermediate image data by a least squares method using the prediction coefficient as student data and the corresponding first image data as teacher data.
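The least squares update with the prediction coefficients as student data can be written out as follows. This is a sketch under assumed notation, none of which appears in the claims: x_j denotes the J pixel values of the block to be updated, w_{kj} the prediction coefficient that weights x_j when predicting the k-th pixel, and y_k the corresponding pixel of the first image data (teacher data). Setting the derivative of the squared error E to zero yields one normal equation per unknown pixel value.

```latex
E = \sum_{k=1}^{K} \Bigl( y_k - \sum_{j=1}^{J} w_{kj}\, x_j \Bigr)^{2},
\qquad
\frac{\partial E}{\partial x_i} = 0
\;\Longrightarrow\;
\sum_{j=1}^{J} \Bigl( \sum_{k=1}^{K} w_{ki}\, w_{kj} \Bigr) x_j
  = \sum_{k=1}^{K} w_{ki}\, y_k
\qquad (i = 1, \dots, J)
```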

A learning method for learning the pixel values of second image data when converting first image data into the second image data of lower quality than the first image data, the method comprising the steps of:
generating, from the first image data, intermediate image data of substantially the same quality as the second image data;
extracting, from the intermediate image data, a plurality of pixels for each block, each block being a part of one screen;
outputting a prediction coefficient that is generated or acquired in advance;
collectively updating the pixel values of the plurality of pixels of the extracted intermediate image data, based on the prediction coefficient, the intermediate image data, and the first image data;
generating predicted image data of substantially the same quality as the first image data, based on the prediction coefficient and the intermediate image data whose pixel values have been updated;
detecting an error between the first image data and the predicted image data; and
determining, based on the error, whether or not to use the intermediate image data as an output image,
wherein the step of updating the pixel values updates the pixel values of the intermediate image data by a least squares method using the prediction coefficient as student data and the corresponding first image data as teacher data.

A recording medium on which is recorded a computer-controllable image data conversion program for converting first image data into second image data of lower quality than the first image data, the program comprising the steps of:
generating, from the first image data, intermediate image data of substantially the same quality as the second image data;
extracting, from the intermediate image data, a plurality of pixels for each block, each block being a part of one screen;
generating a prediction coefficient based on the extracted intermediate image data and the first image data at a position corresponding to the extracted intermediate image data;
collectively updating the pixel values of the plurality of pixels of the extracted intermediate image data, based on the prediction coefficient, the intermediate image data, and the first image data;
generating predicted image data of substantially the same quality as the first image data, based on the prediction coefficient and the intermediate image data whose pixel values have been updated;
detecting an error between the first image data and the predicted image data; and
determining, based on the error, whether or not to use the intermediate image data as an output image.