JP4752088B2

JP4752088B2 - Data processing apparatus, data processing method, and recording medium

Info

Publication number: JP4752088B2
Application number: JP2000135357A
Authority: JP
Inventors: 哲二郎近藤; 俊彦浜松; 秀雄中屋; 丈晴西片; 秀樹大塚; 威國弘; 孝文森藤; 真史内田
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 2000-05-09
Filing date: 2000-05-09
Publication date: 2011-08-17
Anticipated expiration: 2020-05-09
Also published as: JP2001320587A

Description

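【０００１】
【Technical Field of the Invention】
The present invention relates to a data processing apparatus, a data processing method, and a recording medium, and in particular to ones suitable for use when, for example, decoding a compressed image into a high-quality image.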
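【０００２】
【Prior Art】
Digital image data, for example, is large in volume, so recording or transmitting it as-is requires a large-capacity recording or transmission medium. Image data is therefore generally compression-coded to reduce its volume before being recorded or transmitted.
【０００３】
Image compression coding schemes include, for example, JPEG (Joint Photographic Experts Group), a scheme for still images, and MPEG (Moving Picture Experts Group), a scheme for moving images.
【０００４】
For example, image data is encoded and decoded with the JPEG scheme as shown in FIG. 1.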
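【０００５】
That is, FIG. 1(A) shows an example configuration of a conventional JPEG encoding apparatus.
【０００６】
Image data to be encoded is input to a blocking circuit 1, which divides it into blocks of 8×8 = 64 pixels. Each block obtained by the blocking circuit 1 is supplied to a DCT (Discrete Cosine Transform) circuit 2. The DCT circuit 2 applies DCT processing to each block from the blocking circuit 1, converting it into a total of 64 DCT coefficients: one DC (Direct Current) component and 63 frequency components (AC (Alternating Current) components) in the horizontal and vertical directions. The 64 DCT coefficients of each block are supplied from the DCT circuit 2 to a quantization circuit 3.
【０００７】
The quantization circuit 3 quantizes the DCT coefficients from the DCT circuit 2 according to a predetermined quantization table and supplies the result (hereinafter referred to as quantized DCT coefficients where appropriate), together with the quantization table used for the quantization, to an entropy coding circuit 4.
【０００８】
FIG. 1(B) shows an example of the quantization table used in the quantization circuit 3. In consideration of human visual characteristics, a quantization table generally sets quantization steps so that the more important low-frequency DCT coefficients are quantized finely and the less important high-frequency DCT coefficients are quantized coarsely. This enables efficient compression while suppressing degradation of image quality.
【０００９】
The entropy coding circuit 4 applies entropy coding, such as Huffman coding, to the quantized DCT coefficients from the quantization circuit 3, appends the quantization table from the quantization circuit 3, and outputs the resulting coded data as the JPEG encoding result.
【００１０】
Next, FIG. 1(C) shows an example configuration of a conventional JPEG decoding apparatus that decodes the coded data output by the JPEG encoding apparatus of FIG. 1(A).
【００１１】
The coded data is input to an entropy decoding circuit 11, which separates it into the entropy-coded quantized DCT coefficients and the quantization table. The entropy decoding circuit 11 further entropy-decodes the entropy-coded quantized DCT coefficients and supplies the resulting quantized DCT coefficients, together with the quantization table, to an inverse quantization circuit 12. The inverse quantization circuit 12 inversely quantizes the quantized DCT coefficients from the entropy decoding circuit 11 according to the quantization table, also from the entropy decoding circuit 11, and supplies the resulting DCT coefficients to an inverse DCT circuit 13. The inverse DCT circuit 13 applies inverse DCT processing to the DCT coefficients from the inverse quantization circuit 12 and supplies the resulting 8×8-pixel decoded blocks to a block decomposition circuit 14. The block decomposition circuit 14 unblocks the decoded blocks from the inverse DCT circuit 13 to obtain and output a decoded image.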
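【００１２】
【Problems to Be Solved by the Invention】
In the JPEG encoding apparatus of FIG. 1(A), the amount of coded data can be reduced by enlarging the quantization steps of the quantization table that the quantization circuit 3 uses to quantize blocks. That is, high compression can be achieved.
【００１３】
However, enlarging the quantization steps also enlarges the so-called quantization error, so the quality of the decoded image obtained by the JPEG decoding apparatus of FIG. 1(C) deteriorates. That is, blur, block distortion, mosquito noise, and the like appear conspicuously in the decoded image.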
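The relationship between quantization step and quantization error can be illustrated with a minimal sketch. It assumes the common scheme of dividing a coefficient by the step and rounding to the nearest integer; the function names and the sample coefficient value are hypothetical and are not taken from the specification.

```python
def quantize(coeff: float, step: int) -> int:
    """Quantize one DCT coefficient: divide by the step and round."""
    return round(coeff / step)


def dequantize(q: int, step: int) -> float:
    """Inverse quantization: multiply the quantized value by the step."""
    return float(q * step)


def quantization_error(coeff: float, step: int) -> float:
    """Absolute error left after quantizing and inversely quantizing."""
    return abs(coeff - dequantize(quantize(coeff, step), step))


if __name__ == "__main__":
    coeff = 37.0  # hypothetical DCT coefficient value
    for step in (4, 16, 64):
        print(step, quantization_error(coeff, step))
```

With this rounding rule the error is bounded by half the step, so a coarser step permits a larger error, which is the degradation described above.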
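【００１４】
Therefore, to reduce the amount of coded data without degrading the quality of the decoded image, or to improve the quality of the decoded image while keeping the amount of coded data unchanged, some processing for improving image quality must be performed after JPEG decoding.
【００１５】
However, performing quality-improvement processing after JPEG decoding complicates the processing and lengthens the time until the final decoded image is obtained.
【００１６】
The present invention has been made in view of these circumstances, and makes it possible, among other things, to obtain a high-quality decoded image efficiently from a JPEG-encoded image or the like.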
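【００１７】
【Means for Solving the Problems】
A first data processing apparatus of the present invention comprises acquisition means for acquiring tap coefficients obtained by learning, and calculation means for performing a predetermined prediction operation using the tap coefficients and converted data, thereby decoding the converted data into the original data and, at the same time, obtaining processed data in which predetermined processing has been applied to that original data.
【００１８】
In the first data processing apparatus, the calculation means can perform a linear first-order prediction operation using the tap coefficients and the converted data.
【００１９】
The first data processing apparatus can further include storage means storing the tap coefficients, in which case the acquisition means can acquire the tap coefficients from the storage means.
【００２０】
In the first data processing apparatus, the converted data can be data obtained by orthogonally transforming or frequency-transforming the original data and then quantizing the result.
【００２１】
The first data processing apparatus can further include inverse quantization means for inversely quantizing the converted data, and the calculation means can perform the prediction operation using the inversely quantized converted data.
【００２２】
In the first data processing apparatus, the converted data can be data obtained by applying at least a discrete cosine transform to the original data.
【００２３】
The first data processing apparatus can further include prediction tap extraction means for extracting the converted data that is used, together with the tap coefficients, to predict the data of interest among the processed data, and outputting it as prediction taps. In this case, the calculation means can perform the prediction operation using the prediction taps and the tap coefficients.
【００２４】
The first data processing apparatus can further include class tap extraction means for extracting the converted data used to classify the data of interest into one of several classes and outputting it as class taps, and classification means for performing classification to determine the class of the data of interest on the basis of the class taps. In this case, the calculation means can perform the prediction operation using the prediction taps and the tap coefficients corresponding to the class of the data of interest.
【００２５】
In the first data processing apparatus, by performing the predetermined prediction operation, the calculation means can obtain processed data in which processing that improves quality has been applied to the original data.
【００２６】
In the first data processing apparatus, the tap coefficients can be ones obtained by learning such that the prediction error of the predicted values of the processed data, obtained by performing the predetermined prediction operation using the tap coefficients and the converted data, is statistically minimized.
【００２７】
In the first data processing apparatus, the original data can be moving-image or still-image data.
【００２８】
In the first data processing apparatus, by performing the predetermined prediction operation, the calculation means can obtain processed data in which processing that improves image quality has been applied to the image data.
【００２９】
In the first data processing apparatus, the calculation means can obtain processed data in which the resolution of the image data in the temporal or spatial direction has been improved.
【００３０】
A first data processing method of the present invention comprises an acquisition step of acquiring tap coefficients obtained by learning, and a calculation step of performing a predetermined prediction operation using the tap coefficients and converted data, thereby decoding the converted data into the original data and, at the same time, obtaining processed data in which predetermined processing has been applied to that original data.
【００３１】
A first recording medium of the present invention has recorded thereon a program comprising an acquisition step of acquiring tap coefficients obtained by learning, and a calculation step of performing a predetermined prediction operation using the tap coefficients and converted data, thereby decoding the converted data into the original data and, at the same time, obtaining processed data in which predetermined processing has been applied to that original data.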
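【００３２】
A second data processing apparatus of the present invention comprises quasi-teacher data generation means for applying, to teacher data serving as a teacher, processing based on the predetermined processing to obtain quasi-teacher data; student data generation means for generating student data serving as a student by applying at least an orthogonal transform or a frequency transform to the quasi-teacher data; and learning means for determining tap coefficients by learning such that the prediction error of the predicted values of the teacher data, obtained by performing a prediction operation using the tap coefficients and the student data, is statistically minimized.
【００３３】
In the second data processing apparatus, the learning means can perform learning such that the prediction error of the predicted values of the teacher data, obtained by performing a linear first-order prediction operation using the tap coefficients and the student data, is statistically minimized.
【００３４】
In the second data processing apparatus, the student data generation means can generate the student data by orthogonally transforming or frequency-transforming the quasi-teacher data and then quantizing the result.
【００３５】
In the second data processing apparatus, the student data generation means can generate the student data by orthogonally transforming or frequency-transforming the quasi-teacher data, quantizing the result, and then inversely quantizing it.
【００３６】
In the second data processing apparatus, the student data generation means can generate the student data by applying at least a discrete cosine transform to the quasi-teacher data.
【００３７】
The second data processing apparatus can further include prediction tap extraction means for extracting the student data that is used, together with the tap coefficients, to predict the teacher data of interest among the teacher data, and outputting it as prediction taps. In this case, the learning means can perform learning such that the prediction error of the predicted values of the teacher data, obtained by performing the prediction operation using the prediction taps and the tap coefficients, is statistically minimized.
【００３８】
The second data processing apparatus can further include class tap extraction means for extracting the student data used to classify the teacher data of interest into one of several classes and outputting it as class taps, and classification means for performing classification to determine the class of the teacher data of interest on the basis of the class taps. In this case, the learning means can perform learning such that the prediction error of the predicted values of the teacher data, obtained by performing the prediction operation using the prediction taps and the tap coefficients corresponding to the class of the teacher data of interest, is statistically minimized, and can thereby determine tap coefficients for each class.
【００３９】
In the second data processing apparatus, the student data generation means can generate the student data by applying at least an orthogonal transform or a frequency transform to the quasi-teacher data in predetermined units.
【００４０】
In the second data processing apparatus, the quasi-teacher data generation means can generate the quasi-teacher data by applying processing that degrades quality to the teacher data.
【００４１】
In the second data processing apparatus, the teacher data can be moving-image or still-image data.
【００４２】
In the second data processing apparatus, the quasi-teacher data generation means can generate the quasi-teacher data by applying processing that degrades image quality to the image data.
【００４３】
In the second data processing apparatus, the quasi-teacher data generation means can generate quasi-teacher data in which the resolution of the image data in the temporal or spatial direction has been degraded.
【００４４】
A second data processing method of the present invention comprises a quasi-teacher data generation step of applying, to teacher data serving as a teacher, processing based on the predetermined processing to obtain quasi-teacher data; a student data generation step of generating student data serving as a student by applying at least an orthogonal transform or a frequency transform to the quasi-teacher data; and a learning step of determining tap coefficients by learning such that the prediction error of the predicted values of the teacher data, obtained by performing a prediction operation using the tap coefficients and the student data, is statistically minimized.
【００４５】
A second recording medium of the present invention has recorded thereon a program comprising a quasi-teacher data generation step of applying, to teacher data serving as a teacher, processing based on the predetermined processing to obtain quasi-teacher data; a student data generation step of generating student data serving as a student by applying at least an orthogonal transform or a frequency transform to the quasi-teacher data; and a learning step of determining tap coefficients by learning such that the prediction error of the predicted values of the teacher data, obtained by performing a prediction operation using the tap coefficients and the student data, is statistically minimized.
【００４６】
In the first data processing apparatus, data processing method, and recording medium of the present invention, tap coefficients obtained by learning are acquired, and a predetermined prediction operation is performed using those tap coefficients and converted data, whereby the converted data is decoded into the original data and, at the same time, processed data in which predetermined processing has been applied to that original data is obtained.
【００４７】
In the second data processing apparatus, data processing method, and recording medium of the present invention, processing based on the predetermined processing is applied to teacher data serving as a teacher, and student data serving as a student is generated by applying at least an orthogonal transform or a frequency transform to the resulting quasi-teacher data. Learning is then performed such that the prediction error of the predicted values of the teacher data, obtained by performing a prediction operation using the tap coefficients and the student data, is statistically minimized, and the tap coefficients are thereby determined.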
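【００４８】
【Embodiments of the Invention】
FIG. 2 shows an example configuration of one embodiment of an image transmission system to which the present invention is applied.
【００４９】
Image data to be transmitted is supplied to an encoder 21, which encodes it, for example by JPEG encoding, into coded data. That is, the encoder 21 is configured in the same way as, for example, the JPEG encoding apparatus shown in FIG. 1(A) and JPEG-encodes the image data. The coded data obtained by the encoder 21 is recorded on a recording medium 23 such as a semiconductor memory, magneto-optical disk, magnetic disk, optical disk, magnetic tape, or phase-change disk, or is transmitted via a transmission medium 24 such as terrestrial broadcasting, a satellite link, a CATV (Cable Television) network, the Internet, or a public line.
【００５０】
A decoder 22 receives the coded data provided via the recording medium 23 or the transmission medium 24 and decodes it into high-quality image data. The decoded high-quality image data is supplied to, for example, a monitor (not shown) for display or the like.
【００５１】
Next, FIG. 3 shows an example configuration of the decoder 22 of FIG. 2.
【００５２】
The coded data is supplied to an entropy decoding circuit 31, which entropy-decodes it and supplies the resulting quantized DCT coefficients Q of each block to a coefficient conversion circuit 32. As described for the entropy decoding circuit 11 of FIG. 1(C), the coded data contains a quantization table in addition to the entropy-coded quantized DCT coefficients. As described later, this quantization table can be used as needed for decoding the quantized DCT coefficients.
【００５３】
The coefficient conversion circuit 32 performs a predetermined prediction operation using the quantized DCT coefficients Q from the entropy decoding circuit 31 and tap coefficients obtained by the learning described later, thereby decoding the quantized DCT coefficients of each block into the original 8×8-pixel block and, in addition, obtaining data in which processing that improves the image quality of that original block has been applied. That is, although the original block consists of 8×8 pixels, the coefficient conversion circuit 32 obtains, through the prediction operation using the tap coefficients, a block of 16×16 pixels in which both the horizontal and vertical spatial resolution of the 8×8-pixel block are doubled. Accordingly, as shown in FIG. 4, the coefficient conversion circuit 32 here decodes a block of 8×8 quantized DCT coefficients into a block of 16×16 pixels and outputs it.
【００５４】
A block decomposition circuit 33 unblocks the 16×16-pixel blocks obtained by the coefficient conversion circuit 32 to obtain and output a decoded image with improved spatial resolution.
【００５５】
Next, the processing of the decoder 22 of FIG. 3 is described with reference to the flowchart of FIG. 5.
【００５６】
The coded data is sequentially supplied to the entropy decoding circuit 31. In step S1, the entropy decoding circuit 31 entropy-decodes the coded data and supplies the quantized DCT coefficients Q of each block to the coefficient conversion circuit 32. In step S2, the coefficient conversion circuit 32 performs the prediction operation using the tap coefficients to decode the quantized DCT coefficients Q of each block from the entropy decoding circuit 31 into the pixel values of that block while also improving the spatial resolution of the block, and supplies the resulting high-resolution block to the block decomposition circuit 33. In step S3, the block decomposition circuit 33 unblocks the blocks of resolution-enhanced pixel values from the coefficient conversion circuit 32, outputs the resulting high-resolution decoded image, and the processing ends.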
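【００５７】
The coefficient conversion circuit 32 of FIG. 3 can use, for example, classification adaptive processing to decode the quantized DCT coefficients into pixel values and further obtain an image with improved spatial resolution.
【００５８】
Classification adaptive processing consists of classification processing and adaptive processing: the classification processing sorts data into classes according to their properties, and adaptive processing is applied to each class. The adaptive processing uses the technique described below. For simplicity, the adaptive processing is explained here for the case of decoding quantized DCT coefficients into the original image.
【００５９】
In this case, the adaptive processing decodes the quantized DCT coefficients into the original pixel values by obtaining predicted values of the original pixels as, for example, a linear combination of the quantized DCT coefficients and predetermined tap coefficients.
【００６０】
Specifically, suppose that a certain image is used as teacher data, that the quantized DCT coefficients obtained by DCT-processing that image block by block and then quantizing the result are used as student data, and that the predicted value E[y] of the pixel value y of a pixel of the teacher data is to be obtained by a linear first-order combination model defined by a linear combination of a set of several quantized DCT coefficients x₁, x₂, ・・・ and predetermined tap coefficients w₁, w₂, ・・・. The predicted value E[y] can then be expressed by the following equation.
【００６１】
E[y] = w₁x₁ + w₂x₂ + ・・・
・・・(1)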
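Equation (1) is a plain product-sum of taps and tap coefficients. The following is a minimal sketch of that operation; the function name `predict` and the sample values are hypothetical and serve only as illustration.

```python
from typing import Sequence


def predict(taps: Sequence[float], coeffs: Sequence[float]) -> float:
    """Equation (1): E[y] = w1*x1 + w2*x2 + ... for one pixel."""
    if len(taps) != len(coeffs):
        raise ValueError("taps and coeffs must have the same length")
    return sum(w * x for w, x in zip(coeffs, taps))


if __name__ == "__main__":
    # Three hypothetical quantized DCT coefficients and tap coefficients.
    print(predict([1.0, 2.0, 3.0], [0.5, 0.25, 1.0]))
```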
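【００６２】
To generalize equation (1), define the matrix W consisting of the set of tap coefficients w_j, the matrix X consisting of the set of student data x_ij, and the matrix Y' consisting of the set of predicted values E[y_i] as
【Equation 1】
$$
X = \begin{pmatrix} x_{11} & x_{12} & \cdots & x_{1J} \\ x_{21} & x_{22} & \cdots & x_{2J} \\ \vdots & \vdots & \ddots & \vdots \\ x_{I1} & x_{I2} & \cdots & x_{IJ} \end{pmatrix},\quad
W = \begin{pmatrix} w_1 \\ w_2 \\ \vdots \\ w_J \end{pmatrix},\quad
Y' = \begin{pmatrix} E[y_1] \\ E[y_2] \\ \vdots \\ E[y_I] \end{pmatrix}
$$
Then the following observation equation holds.
【００６３】
XW = Y'
・・・(2)
Here, the component x_ij of the matrix X denotes the j-th student datum in the i-th set of student data (the set of student data used to predict the i-th teacher datum y_i), and the component w_j of the matrix W denotes the tap coefficient that is multiplied by the j-th student datum in a set of student data. Further, y_i denotes the i-th teacher datum, so E[y_i] denotes the predicted value of the i-th teacher datum. The y on the left side of equation (1) is the component y_i of the matrix Y with the suffix i omitted, and x₁, x₂, ・・・ on the right side of equation (1) are likewise the components x_ij of the matrix X with the suffix i omitted.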
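【００６４】
Now consider applying the least squares method to this observation equation to obtain predicted values E[y] close to the original pixel values y. Define the matrix Y consisting of the set of true pixel values y serving as teacher data, and the matrix E consisting of the set of residuals e of the predicted values E[y] with respect to the pixel values y, as
【００６５】
【Equation 2】
$$
E = \begin{pmatrix} e_1 \\ e_2 \\ \vdots \\ e_I \end{pmatrix},\quad
Y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_I \end{pmatrix}
$$
Then, from equation (2), the following residual equation holds.
【００６６】
XW = Y + E
・・・(3)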
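【００６７】
In this case, the tap coefficients w_j for obtaining predicted values E[y] close to the original pixel values y can be found by minimizing the following squared error.
【Equation 3】
$$
\sum_{i=1}^{I} e_i^{2}
$$
【００６８】
Therefore, the tap coefficients w_j for which the derivative of this squared error with respect to w_j is zero, that is, the w_j satisfying the following equation, are the optimal values for obtaining predicted values E[y] close to the original pixel values y.
【００６９】
【Equation 4】
$$
e_1 \frac{\partial e_1}{\partial w_j} + e_2 \frac{\partial e_2}{\partial w_j} + \cdots + e_I \frac{\partial e_I}{\partial w_j} = 0 \qquad (j = 1, 2, \ldots, J)
$$
・・・(4)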
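【００７０】
First, differentiating equation (3) with respect to the tap coefficients w_j gives the following.
【００７１】
【Equation 5】
$$
\frac{\partial e_i}{\partial w_1} = x_{i1},\quad \frac{\partial e_i}{\partial w_2} = x_{i2},\quad \ldots,\quad \frac{\partial e_i}{\partial w_J} = x_{iJ} \qquad (i = 1, 2, \ldots, I)
$$
・・・(5)
【００７２】
Equations (4) and (5) yield equation (6).
【００７３】
【Equation 6】
$$
\sum_{i=1}^{I} e_i x_{i1} = 0,\quad \sum_{i=1}^{I} e_i x_{i2} = 0,\quad \ldots,\quad \sum_{i=1}^{I} e_i x_{iJ} = 0
$$
・・・(6)
【００７４】
Further, taking into account the relationship among the student data x_ij, the tap coefficients w_j, the teacher data y_i, and the residuals e_i in the residual equation (3), the following normal equations are obtained from equation (6).
【００７５】
【Equation 7】
$$
\begin{cases}
\left(\sum_{i=1}^{I} x_{i1}x_{i1}\right) w_1 + \left(\sum_{i=1}^{I} x_{i1}x_{i2}\right) w_2 + \cdots + \left(\sum_{i=1}^{I} x_{i1}x_{iJ}\right) w_J = \sum_{i=1}^{I} x_{i1}y_i \\
\left(\sum_{i=1}^{I} x_{i2}x_{i1}\right) w_1 + \left(\sum_{i=1}^{I} x_{i2}x_{i2}\right) w_2 + \cdots + \left(\sum_{i=1}^{I} x_{i2}x_{iJ}\right) w_J = \sum_{i=1}^{I} x_{i2}y_i \\
\qquad\vdots \\
\left(\sum_{i=1}^{I} x_{iJ}x_{i1}\right) w_1 + \left(\sum_{i=1}^{I} x_{iJ}x_{i2}\right) w_2 + \cdots + \left(\sum_{i=1}^{I} x_{iJ}x_{iJ}\right) w_J = \sum_{i=1}^{I} x_{iJ}y_i
\end{cases}
$$
・・・(7)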
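【００７６】
Defining the matrix (covariance matrix) A and the vector v as
【Equation 8】
$$
A = \begin{pmatrix}
\sum_{i=1}^{I} x_{i1}x_{i1} & \sum_{i=1}^{I} x_{i1}x_{i2} & \cdots & \sum_{i=1}^{I} x_{i1}x_{iJ} \\
\sum_{i=1}^{I} x_{i2}x_{i1} & \sum_{i=1}^{I} x_{i2}x_{i2} & \cdots & \sum_{i=1}^{I} x_{i2}x_{iJ} \\
\vdots & \vdots & \ddots & \vdots \\
\sum_{i=1}^{I} x_{iJ}x_{i1} & \sum_{i=1}^{I} x_{iJ}x_{i2} & \cdots & \sum_{i=1}^{I} x_{iJ}x_{iJ}
\end{pmatrix},\quad
v = \begin{pmatrix} \sum_{i=1}^{I} x_{i1}y_i \\ \sum_{i=1}^{I} x_{i2}y_i \\ \vdots \\ \sum_{i=1}^{I} x_{iJ}y_i \end{pmatrix}
$$
and defining the vector W as in Equation 1, the normal equations shown in equation (7) can be written as
AW = v
・・・(8)
【００７７】
By preparing a sufficient number of sets of student data x_ij and teacher data y_i, as many of the normal equations in equation (7) can be set up as the number J of tap coefficients w_j to be found. Solving equation (8) for the vector W (which requires the matrix A in equation (8) to be nonsingular) therefore gives the optimal tap coefficients w_j (here, the tap coefficients that minimize the squared error). Equation (8) can be solved using, for example, the sweep-out method (Gauss-Jordan elimination).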
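As an illustration of equations (7) and (8), the sketch below accumulates A and v from a list of (student data, teacher data) pairs and solves AW = v by Gauss-Jordan elimination. The function names, the partial pivoting, the singularity tolerance `eps`, and the toy data are assumptions made for this example; the specification only names the elimination method.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], float]  # (student data x_i1..x_iJ, teacher y_i)


def build_normal_equations(samples: List[Sample]):
    """Accumulate A[n][m] = sum_i x_in*x_im and v[n] = sum_i x_in*y_i."""
    num_taps = len(samples[0][0])
    a = [[0.0] * num_taps for _ in range(num_taps)]
    v = [0.0] * num_taps
    for x, y in samples:
        for n in range(num_taps):
            v[n] += x[n] * y
            for m in range(num_taps):
                a[n][m] += x[n] * x[m]
    return a, v


def gauss_jordan_solve(a, v, eps: float = 1e-12) -> List[float]:
    """Solve A W = v; raises ValueError when A is (numerically) singular."""
    size = len(v)
    aug = [list(row) + [v[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(size):
        # Partial pivoting: pick the row with the largest entry in this column.
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < eps:
            raise ValueError("matrix A is singular; more training data needed")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [e / p for e in aug[col]]
        for r in range(size):
            if r != col:
                f = aug[r][col]
                aug[r] = [e - f * b for e, b in zip(aug[r], aug[col])]
    return [aug[i][size] for i in range(size)]


if __name__ == "__main__":
    # Toy data generated from y = 2*x1 - 3*x2.
    data = [((1.0, 0.0), 2.0), ((0.0, 1.0), -3.0),
            ((1.0, 1.0), -1.0), ((2.0, 1.0), 1.0)]
    a, v = build_normal_equations(data)
    print(gauss_jordan_solve(a, v))
```

The `ValueError` branch corresponds to the condition stated above that A must be nonsingular, which in practice means enough varied training samples have been accumulated.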
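【００７８】
Adaptive processing, then, consists of finding the optimal tap coefficients w_j in this way and using them in equation (1) to obtain predicted values E[y] close to the original pixel values y.
【００７９】
If, for example, an image of the same quality as the image to be JPEG-encoded is used as teacher data, and the quantized DCT coefficients obtained by applying DCT and quantization to that teacher data are used as student data, the tap coefficients obtained are those that statistically minimize the prediction error in decoding JPEG-encoded image data into the original image data.
【００８０】
Therefore, even if the compression ratio of JPEG encoding is raised, that is, even if the quantization steps used for quantization are made coarser, the adaptive processing performs decoding that statistically minimizes the prediction error. In effect, decoding of the JPEG-encoded image and processing for improving its image quality (hereinafter referred to as improvement processing where appropriate) are performed simultaneously. As a result, the quality of the decoded image can be maintained even at a higher compression ratio.
【００８１】
If, on the other hand, an image of higher quality than the image to be JPEG-encoded is used as teacher data, and the student data are the quantized DCT coefficients obtained by degrading that teacher data to the same quality as the image to be JPEG-encoded and then applying DCT and quantization, the tap coefficients obtained are those that statistically minimize the prediction error in decoding JPEG-encoded image data into high-quality image data.
【００８２】
In this case too, the adaptive processing simultaneously performs decoding of the JPEG-encoded image and improvement processing that further raises its image quality. It follows from the above that, by changing the quality of the images used as teacher data or student data, tap coefficients can be obtained that bring the decoded image to any desired quality level.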
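【００８３】
FIG. 6 shows a first example configuration of the coefficient conversion circuit 32 of FIG. 3, which decodes quantized DCT coefficients into pixel values by the classification adaptive processing described above.
【００８４】
The quantized DCT coefficients of each block output by the entropy decoding circuit 31 (FIG. 3) are supplied to a prediction tap extraction circuit 41 and a class tap extraction circuit 42.
【００８５】
The prediction tap extraction circuit 41 sequentially takes, as the high-quality block of interest, the block of high-quality pixel values (hereinafter referred to as a high-quality block where appropriate; in this embodiment a 16×16-pixel block, as described above) corresponding to each block of 8×8 quantized DCT coefficients supplied to it (hereinafter referred to as a DCT block where appropriate). This block of pixel values does not exist at this stage but is assumed virtually. The circuit further takes each pixel of the high-quality block of interest in turn, for example in so-called raster scan order, as the pixel of interest. The prediction tap extraction circuit 41 then extracts the quantized DCT coefficients used to predict the pixel value of the pixel of interest and takes them as prediction taps.
【００８６】
That is, as shown in FIG. 7 for example, the prediction tap extraction circuit 41 extracts as prediction taps all the quantized DCT coefficients, namely the 8×8 = 64 coefficients, of the DCT block corresponding to the high-quality block to which the pixel of interest belongs. In this embodiment, therefore, the same prediction taps are formed for all pixels of a given high-quality block. The prediction taps can, however, be formed from different quantized DCT coefficients for each pixel of interest.
【００８７】
The prediction taps obtained by the prediction tap extraction circuit 41 for each pixel of the high-quality block, that is, 256 sets of prediction taps for the 16×16 = 256 pixels, are supplied to a product-sum operation circuit 45. In this embodiment, however, the same prediction taps are formed for all pixels of a high-quality block as described above, so in practice it suffices to supply one set of prediction taps per high-quality block to the product-sum operation circuit 45.
【００８８】
The class tap extraction circuit 42 extracts the quantized DCT coefficients used for classifying the pixel of interest into one of several classes and takes them as class taps.
【００８９】
In JPEG encoding, an image is encoded (DCT-processed and quantized) in blocks of 8×8 pixels (hereinafter referred to as pixel blocks where appropriate), so the pixels belonging to the high-quality block obtained by enhancing a given pixel block are, for example, all classified into the same class. The class tap extraction circuit 42 therefore forms the same class taps for every pixel of a given high-quality block. That is, like the prediction tap extraction circuit 41, the class tap extraction circuit 42 extracts as class taps all 8×8 quantized DCT coefficients of the DCT block corresponding to the high-quality block to which the pixel of interest belongs, as shown in FIG. 7.
【００９０】
Classifying all pixels of a high-quality block into the same class is equivalent to classifying the high-quality block itself. It therefore suffices for the class tap extraction circuit 42 to form not 256 sets of class taps for classifying each of the 16×16 = 256 pixels of the high-quality block of interest, but one set of class taps for classifying that block. For each high-quality block, the class tap extraction circuit 42 accordingly extracts the 64 quantized DCT coefficients of the corresponding DCT block as class taps for classifying that block.
【００９１】
The quantized DCT coefficients forming the prediction taps and class taps are not limited to the patterns described above.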
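【００９２】
The class taps of the high-quality block of interest obtained by the class tap extraction circuit 42 are supplied to a classification circuit 43, which classifies the high-quality block of interest on the basis of those class taps and outputs a class code corresponding to the resulting class.
【００９３】
As the classification method, ADRC (Adaptive Dynamic Range Coding), for example, can be used.
【００９４】
In the ADRC-based method, the quantized DCT coefficients forming the class taps are ADRC-processed, and the class of the high-quality block of interest is determined according to the resulting ADRC code.
【００９５】
In K-bit ADRC, for example, the maximum value MAX and minimum value MIN of the quantized DCT coefficients forming the class taps are detected, DR = MAX − MIN is taken as the local dynamic range of the set, and the quantized DCT coefficients forming the class taps are requantized to K bits on the basis of this dynamic range DR. That is, the minimum value MIN is subtracted from each quantized DCT coefficient forming the class taps, and the difference is divided (quantized) by DR/2^K. The bit string obtained by arranging the resulting K-bit quantized DCT coefficients in a predetermined order is output as the ADRC code. Accordingly, when the class taps are subjected to 1-bit ADRC processing, for example, the minimum value MIN is subtracted from each quantized DCT coefficient forming the class taps and the result is divided by the average of the maximum value MAX and the minimum value MIN, so that each quantized DCT coefficient becomes 1 bit (is binarized). The bit string obtained by arranging these 1-bit quantized DCT coefficients in a predetermined order is output as the ADRC code.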
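The K-bit requantization can be sketched as follows. This is a minimal illustration of the DR/2^K rule only, under stated assumptions: the quotient is truncated to an integer, the maximum value is clamped to the top K-bit level (otherwise MAX itself would map to 2^K), a flat tap set with DR = 0 maps to all zeros, and the taps are packed in input order. None of these details is fixed by the text, and the function name `adrc_code` is hypothetical.

```python
from typing import Sequence


def adrc_code(taps: Sequence[float], k: int = 1) -> int:
    """Requantize each tap to k bits relative to the local dynamic range
    and pack the results, in input order, into one integer code."""
    lo, hi = min(taps), max(taps)
    dr = hi - lo                      # local dynamic range DR = MAX - MIN
    levels = 1 << k                   # 2^K quantization levels
    code = 0
    for x in taps:
        if dr == 0:
            q = 0                     # assumption: flat taps map to level 0
        else:
            # (x - MIN) / (DR / 2^K), truncated and clamped to k bits
            q = min(int((x - lo) * levels / dr), levels - 1)
        code = (code << k) | q
    return code


if __name__ == "__main__":
    print(bin(adrc_code([0, 10, 4, 6], k=1)))
    print(bin(adrc_code([0, 10, 4, 6], k=2)))
```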
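【００９６】
The classification circuit 43 could, for example, output the level distribution pattern of the quantized DCT coefficients forming the class taps directly as the class code. In that case, however, if the class taps consist of N quantized DCT coefficients and each coefficient is assigned K bits, the number of possible class codes output by the classification circuit 43 is (2^N)^K, an enormous number that grows exponentially with the number of bits K of the quantized DCT coefficients.
【００９７】
It is therefore preferable for the classification circuit 43 to compress the amount of information in the class taps by the ADRC processing described above, by vector quantization, or the like before performing classification.
【００９８】
In this embodiment, the class taps consist of 64 quantized DCT coefficients as described above. Hence, even if classification were performed by 1-bit ADRC processing of the class taps, the number of possible class codes would be as large as 2⁶⁴.
【００９９】
In this embodiment, therefore, the classification circuit 43 extracts highly important feature quantities from the quantized DCT coefficients forming the class taps and performs classification on the basis of those feature quantities, thereby reducing the number of classes.
【０１００】
That is, FIG. 8 shows an example configuration of the classification circuit 43 of FIG. 6.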
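【０１０１】
The class taps are supplied to a power calculation circuit 51, which divides the quantized DCT coefficients forming the class taps into several spatial frequency bands and calculates the power of each band.
【０１０２】
That is, the power calculation circuit 51 divides the 8×8 quantized DCT coefficients forming the class taps into, for example, the four spatial frequency bands S₀, S₁, S₂, and S₃ shown in FIG. 9.
【０１０３】
If each of the 8×8 quantized DCT coefficients forming the class taps is denoted by the letter x with a sequential integer subscript starting from 0 in raster scan order, as shown in FIG. 7, then the spatial frequency band S₀ consists of the 4 quantized DCT coefficients x₀, x₁, x₈, x₉, and the spatial frequency band S₁ consists of the 12 quantized DCT coefficients x₂ to x₇ and x₁₀ to x₁₅. The spatial frequency band S₂ consists of the 12 quantized DCT coefficients x₁₆, x₁₇, x₂₄, x₂₅, x₃₂, x₃₃, x₄₀, x₄₁, x₄₈, x₄₉, x₅₆, x₅₇, and the spatial frequency band S₃ consists of the 36 quantized DCT coefficients x₁₈ to x₂₃, x₂₆ to x₃₁, x₃₄ to x₃₉, x₄₂ to x₄₇, x₅₀ to x₅₅, and x₅₈ to x₆₃.
【０１０４】
Further, the power calculation circuit 51 calculates the powers P₀, P₁, P₂, and P₃ of the AC components of the quantized DCT coefficients for the spatial frequency bands S₀, S₁, S₂, and S₃, respectively, and outputs them to a class code generation circuit 52.
【０１０５】
That is, for the spatial frequency band S₀, the power calculation circuit 51 obtains the sum of squares x₁² + x₈² + x₉² of the AC components x₁, x₈, x₉ among the four quantized DCT coefficients x₀, x₁, x₈, x₉, and outputs it to the class code generation circuit 52 as the power P₀. For the spatial frequency band S₁, the power calculation circuit 51 obtains the sum of squares of the AC components of the 12 quantized DCT coefficients, that is, of all 12 coefficients, and outputs it to the class code generation circuit 52 as the power P₁. For the spatial frequency bands S₂ and S₃, the power calculation circuit 51 likewise obtains the powers P₂ and P₃ in the same way as for S₁ and outputs them to the class code generation circuit 52.
【０１０６】
The class code generation circuit 52 compares the powers P₀, P₁, P₂, and P₃ from the power calculation circuit 51 with the corresponding thresholds TH0, TH1, TH2, and TH3 stored in a threshold table storage unit 53, and outputs a class code based on the results of the comparisons. That is, the class code generation circuit 52 compares the power P₀ with the threshold TH0 and obtains a 1-bit code representing which is larger. Similarly, it compares the power P₁ with the threshold TH1, the power P₂ with the threshold TH2, and the power P₃ with the threshold TH3, obtaining a 1-bit code for each. The class code generation circuit 52 then outputs the 4-bit code obtained by arranging these four 1-bit codes in, for example, a predetermined order (hence a value from 0 to 15) as the class code representing the class of the high-quality block of interest. In this embodiment, therefore, the high-quality block of interest is classified into one of 2⁴ (= 16) classes.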
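The band-power classification can be sketched as below. The band membership follows the coefficient indices listed above; the bit order (P₀ as the most significant bit), the use of `>=` for the comparison, and the threshold values in the usage example are assumptions, since the text specifies only "a predetermined order" and does not give threshold values.

```python
from typing import List, Sequence

# Raster-scan indices (0..63) of the four spatial frequency bands.
S0 = [0, 1, 8, 9]
S1 = [c for r in range(2) for c in range(8 * r + 2, 8 * r + 8)]
S2 = [8 * r + c for r in range(2, 8) for c in range(2)]
S3 = [8 * r + c for r in range(2, 8) for c in range(2, 8)]
BANDS = (S0, S1, S2, S3)


def band_powers(block: Sequence[float]) -> List[float]:
    """Sum of squares of the AC coefficients in each band (DC x0 excluded)."""
    return [sum(block[i] ** 2 for i in band if i != 0) for band in BANDS]


def class_code(block: Sequence[float], thresholds: Sequence[float]) -> int:
    """4-bit class code: one bit per band, 1 when power >= threshold."""
    code = 0
    for power, th in zip(band_powers(block), thresholds):
        code = (code << 1) | (1 if power >= th else 0)
    return code


if __name__ == "__main__":
    block = [0.0] * 64
    block[0], block[1], block[2], block[16], block[63] = 100.0, 3.0, 4.0, 1.0, 2.0
    print(band_powers(block))
    print(class_code(block, thresholds=(5.0, 20.0, 1.0, 10.0)))  # hypothetical TH0..TH3
```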
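【０１０７】
The threshold table storage unit 53 stores the thresholds TH0 to TH3 to be compared with the powers P₀ to P₃ of the spatial frequency bands S₀ to S₃, respectively.
【０１０８】
In the case described above, the DC component x₀ of the quantized DCT coefficients is not used in the classification processing, but classification can also be performed using this DC component x₀.
【０１０９】
Returning to FIG. 6, the class code output by the classification circuit 43 described above is given to a coefficient table storage unit 44 as an address.
【０１１０】
The coefficient table storage unit 44 stores a coefficient table in which tap coefficients obtained by the learning processing described later are registered, and outputs the tap coefficients stored at the address corresponding to the class code output by the classification circuit 43 to the product-sum operation circuit 45.
【０１１１】
In this embodiment, one class code is obtained for the high-quality block of interest. On the other hand, a high-quality block in this embodiment consists of 16×16 = 256 pixels, so 256 sets of tap coefficients are needed for the high-quality block of interest, one for decoding each of its 256 pixels. The coefficient table storage unit 44 therefore stores 256 sets of tap coefficients at the address corresponding to each class code.
【０１１２】
The product-sum operation circuit 45 acquires the prediction taps output by the prediction tap extraction circuit 41 and the tap coefficients output by the coefficient table storage unit 44, performs the linear prediction operation (product-sum operation) of equation (1) using them, and outputs the resulting (predicted) pixel values of the 16×16 pixels of the high-quality block of interest to the block decomposition circuit 33 (FIG. 3) as the decoding result of the corresponding DCT block.
【０１１３】
As described above, the prediction tap extraction circuit 41 takes each pixel of the high-quality block of interest in turn as the pixel of interest. The product-sum operation circuit 45 performs its processing in an operation mode (hereinafter referred to as a pixel position mode where appropriate) corresponding to the position, within the high-quality block of interest, of the pixel currently serving as the pixel of interest.
【０１１４】
That is, if the i-th pixel in raster scan order among the pixels of the high-quality block of interest is denoted p_i, then when the pixel p_i is the pixel of interest, the product-sum operation circuit 45 performs the processing of pixel position mode #i.
【０１１５】
Specifically, as described above, the coefficient table storage unit 44 outputs 256 sets of tap coefficients for decoding each of the 256 pixels of the high-quality block of interest. If the set of tap coefficients for decoding the pixel p_i is denoted W_i, then in pixel position mode #i the product-sum operation circuit 45 performs the product-sum operation of equation (1) using the prediction taps and the set W_i among the 256 sets of tap coefficients, and takes the result as the decoding result of the pixel p_i.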
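The per-block decoding described above can be sketched as follows, assuming the coefficient table is held as a mapping from class code to a list of 256 coefficient sets (one per pixel position mode), each of 64 tap coefficients. This data layout and the names `decode_block` and `coeff_table` are hypothetical; the toy table in the usage example is not learned data.

```python
from typing import Dict, List, Sequence


def decode_block(taps: Sequence[float],
                 coeff_table: Dict[int, List[Sequence[float]]],
                 class_code: int) -> List[List[float]]:
    """Predict the 16x16 pixels of one high-quality block from the 64
    (quantized) DCT coefficients of the corresponding DCT block."""
    coeff_sets = coeff_table[class_code]          # 256 sets W_0..W_255
    if len(coeff_sets) != 256:
        raise ValueError("expected one coefficient set per pixel position mode")
    # Pixel position mode #i uses set W_i with the shared prediction taps.
    pixels = [sum(w * x for w, x in zip(coeff_sets[mode], taps))
              for mode in range(256)]
    return [pixels[16 * r:16 * r + 16] for r in range(16)]  # raster order


if __name__ == "__main__":
    # Toy table: in mode i only the DC tap is used, with weight i.
    table = {0: [[float(i)] + [0.0] * 63 for i in range(256)]}
    block = decode_block([2.0] + [0.0] * 63, table, class_code=0)
    print(block[0][:4], block[15][15])
```

Because every pixel of a block shares the same prediction taps in this embodiment, only the coefficient set changes from one pixel position mode to the next.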
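【０１１６】
Next, the processing of the coefficient conversion circuit 32 of FIG. 6 is described with reference to the flowchart of FIG. 10.
【０１１７】
The quantized DCT coefficients of each block output by the entropy decoding circuit 31 (FIG. 3) are sequentially received by the prediction tap extraction circuit 41 and the class tap extraction circuit 42, and the prediction tap extraction circuit 41 sequentially takes the high-quality block corresponding to each block of quantized DCT coefficients (DCT block) supplied to it as the high-quality block of interest.
【０１１８】
In step S11, the class tap extraction circuit 42 extracts, from the quantized DCT coefficients it has received, those used to classify the high-quality block of interest, forms class taps from them, and supplies the class taps to the classification circuit 43.
【０１１９】
In step S12, the classification circuit 43 classifies the high-quality block of interest using the class taps from the class tap extraction circuit 42 and outputs the resulting class code to the coefficient table storage unit 44.
【０１２０】
That is, in step S12, as shown in the flowchart of FIG. 11, first in step S21 the power calculation circuit 51 of the classification circuit 43 (FIG. 8) divides the 8×8 quantized DCT coefficients forming the class taps into the four spatial frequency bands S₀ to S₃ shown in FIG. 9 and calculates their respective powers P₀ to P₃. The powers P₀ to P₃ are output from the power calculation circuit 51 to the class code generation circuit 52.
【０１２１】
In step S22, the class code generation circuit 52 reads the thresholds TH0 to TH3 from the threshold table storage unit 53, compares each of the powers P₀ to P₃ from the power calculation circuit 51 with the corresponding threshold TH0 to TH3, generates a class code based on the results of the comparisons, and returns.
【０１２２】
Returning to FIG. 10, the class code obtained in step S12 as described above is given as an address from the classification circuit 43 to the coefficient table storage unit 44.
【０１２３】
On receiving the class code as an address from the classification circuit 43, the coefficient table storage unit 44, in step S13, reads the 256 sets of tap coefficients stored at that address (the 256 sets of tap coefficients corresponding to the class of the class code) and outputs them to the product-sum operation circuit 45.
【０１２４】
The processing then proceeds to step S14, where the prediction tap extraction circuit 41 takes as the pixel of interest a pixel of the high-quality block of interest that has not yet been the pixel of interest, in raster scan order, extracts the quantized DCT coefficients used to predict the pixel value of that pixel of interest, and forms prediction taps from them. The prediction taps are supplied from the prediction tap extraction circuit 41 to the product-sum operation circuit 45.
【０１２５】
In this embodiment, the same prediction taps are formed for all pixels of each high-quality block, so in practice the processing of step S14 need be performed only for the pixel that is first taken as the pixel of interest in the high-quality block of interest and can be omitted for the remaining 255 pixels.
【０１２６】
In step S15, the product-sum operation circuit 45 acquires, from the 256 sets of tap coefficients output by the coefficient table storage unit 44 in step S13, the set of tap coefficients corresponding to the pixel position mode of the pixel of interest, performs the product-sum operation of equation (1) using that set and the prediction taps supplied from the prediction tap extraction circuit 41 in step S14, and obtains the decoded pixel value of the pixel of interest.
【０１２７】
The processing then proceeds to step S16, where the prediction tap extraction circuit 41 determines whether all pixels of the high-quality block of interest have been processed as the pixel of interest. If it is determined in step S16 that not all pixels have yet been processed, the processing returns to step S14, where the prediction tap extraction circuit 41 takes as the new pixel of interest a pixel of the high-quality block of interest that has not yet been the pixel of interest, in raster scan order, and the same processing is repeated.
【０１２８】
If it is determined in step S16 that all pixels of the high-quality block of interest have been processed as the pixel of interest, that is, if the decoded values of all pixels of the high-quality block of interest (the 8×8 quantized DCT coefficients decoded into 8×8 pixels and further enhanced to 16×16 pixels) have been obtained, the product-sum operation circuit 45 outputs the high-quality block formed by those decoded values to the block decomposition circuit 33 (FIG. 3), and the processing ends.
【０１２９】
The processing according to the flowchart of FIG. 10 is repeated each time the prediction tap extraction circuit 41 sets a new high-quality block of interest.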
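【０１３０】
Next, FIG. 12 shows an example configuration of one embodiment of a learning device that performs the learning processing for the tap coefficients to be stored in the coefficient table storage unit 44 of FIG. 6.
【０１３１】
One or more images of learning image data are supplied to a thinning circuit 60 as teacher data serving as the teacher during learning. The thinning circuit 60 applies to the image serving as teacher data processing based on the improvement processing that the product-sum operation circuit 45 in the coefficient conversion circuit 32 of FIG. 6 carries out through the product-sum operation using the tap coefficients. Here, the improvement processing converts 8×8 pixels into high-quality (resolution-enhanced) 16×16 pixels with twice the horizontal and vertical spatial resolution, so the thinning circuit 60 thins out the pixels of the image data serving as teacher data to generate image data whose horizontal and vertical pixel counts are both halved (hereinafter referred to as quasi-teacher data where appropriate).
【０１３２】
The image data serving as quasi-teacher data has the same image quality (resolution) as the image data to be JPEG-encoded by the encoder 21 (FIG. 2). If, for example, the image to be JPEG-encoded is an SD (Standard Density) image, an HD (High Density) image having twice the horizontal and vertical pixel counts of that SD image must be used as the teacher data.
【０１３３】
A blocking circuit 61 divides the one or more SD images serving as quasi-teacher data generated by the thinning circuit 60 into pixel blocks of 8×8 pixels, as in JPEG encoding.
【０１３４】
A DCT circuit 62 sequentially reads the pixel blocks formed by the blocking circuit 61 and DCT-processes them into blocks of DCT coefficients. These blocks of DCT coefficients are supplied to a quantization circuit 63.
【０１３５】
The quantization circuit 63 quantizes the blocks of DCT coefficients from the DCT circuit 62 according to the same quantization table as is used for JPEG encoding in the encoder 21 (FIG. 2), and sequentially supplies the resulting blocks of quantized DCT coefficients (DCT blocks) to a prediction tap extraction circuit 64 and a class tap extraction circuit 65.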
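【０１３６】
For the pixel serving as the pixel of interest among the 16×16 pixels of the high-quality block that a normal equation adding circuit 67, described later, takes as the high-quality block of interest, the prediction tap extraction circuit 64 forms the same prediction taps as those formed by the prediction tap extraction circuit 41 of FIG. 6, by extracting the necessary quantized DCT coefficients from the output of the quantization circuit 63. These prediction taps are supplied from the prediction tap extraction circuit 64 to the normal equation adding circuit 67 as student data serving as the student during learning.
【０１３７】
For the high-quality block of interest, the class tap extraction circuit 65 forms the same class taps as those formed by the class tap extraction circuit 42 of FIG. 6, by extracting the necessary quantized DCT coefficients from the output of the quantization circuit 63. These class taps are supplied from the class tap extraction circuit 65 to a classification circuit 66.
【０１３８】
The classification circuit 66 classifies the high-quality block of interest by performing the same processing as the classification circuit 43 of FIG. 6 using the class taps from the class tap extraction circuit 65, and supplies the resulting class code to the normal equation adding circuit 67.
【０１３９】
The normal equation adding circuit 67 is supplied with the same HD image as is supplied to the thinning circuit 60 as teacher data. It divides the HD image into high-quality blocks of 16×16 pixels and sequentially takes each as the high-quality block of interest. Further, the normal equation adding circuit 67 sequentially takes as the pixel of interest those of the 16×16 pixels of the high-quality block of interest that have not yet been the pixel of interest, for example in raster scan order, and performs summation on the pixel of interest (its pixel value) and the prediction taps (the quantized DCT coefficients forming them) from the prediction tap extraction circuit 64.
【０１４０】
That is, for each class corresponding to the class code supplied from the classification circuit 66, the normal equation adding circuit 67 uses the prediction taps (student data) to perform the operations corresponding to the multiplication of student data by student data (x_in x_im) and the summation (Σ) that form each component of the matrix A in equation (8).
【０１４１】
Further, again for each class corresponding to the class code supplied from the classification circuit 66, the normal equation adding circuit 67 uses the prediction taps (student data) and the pixel of interest (teacher data) to perform the operations corresponding to the multiplication of student data by teacher data (x_in y_i) and the summation (Σ) that form each component of the vector v in equation (8).
【０１４２】
The summation in the normal equation adding circuit 67 described above is performed, for each class, separately for each pixel position mode of the pixel of interest.
【０１４３】
The normal equation adding circuit 67 performs the above summation with every pixel of the HD image supplied to it as teacher data taken as the pixel of interest, thereby setting up the normal equation of equation (8) for each pixel position mode of each class.
【０１４４】
A tap coefficient determining circuit 68 solves the normal equations generated for each class (and for each pixel position mode) in the normal equation adding circuit 67 to obtain 256 sets of tap coefficients for each class, and supplies them to the address corresponding to each class in a coefficient table storage unit 69.
【０１４５】
Depending on the number of images prepared for learning, their content, and so on, there may be classes for which the normal equation adding circuit 67 cannot obtain the number of normal equations needed to determine the tap coefficients. For such classes, the tap coefficient determining circuit 68 outputs, for example, default tap coefficients.
【０１４６】
The coefficient table storage unit 69 stores the 256 sets of tap coefficients for each class supplied from the tap coefficient determining circuit 68.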
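【０１４７】
Next, the processing (learning processing) of the learning device of FIG. 12 is described with reference to the flowchart of FIG. 13.
【０１４８】
An HD image, which is learning image data, is supplied to the thinning circuit 60 as teacher data. In step S30, the thinning circuit 60 thins out the pixels of the HD image serving as teacher data to generate an SD image serving as quasi-teacher data, whose horizontal and vertical pixel counts are both halved.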
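Step S30 can be sketched as a simple subsampling. This is a minimal illustration that keeps every second pixel in each direction with no pre-filtering; the text does not specify which pixels are kept or whether any filtering is applied, so both are assumptions, and the function name `thin_by_half` is hypothetical.

```python
from typing import List, Sequence


def thin_by_half(image: Sequence[Sequence[float]]) -> List[List[float]]:
    """Keep every second pixel horizontally and vertically, halving both
    pixel counts (HD teacher image -> SD quasi-teacher image)."""
    return [list(row[::2]) for row in image[::2]]


if __name__ == "__main__":
    hd = [[r * 4 + c for c in range(4)] for r in range(4)]  # 4x4 toy image
    print(thin_by_half(hd))
```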
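【０１４９】
In step S31, the blocking circuit 61 divides the SD image serving as quasi-teacher data obtained by the thinning circuit 60 into pixel blocks of 8×8 pixels, as in JPEG encoding by the encoder 21 (FIG. 2), and the processing proceeds to step S32. In step S32, the DCT circuit 62 sequentially reads the pixel blocks formed by the blocking circuit 61 and DCT-processes them into blocks of DCT coefficients, and the processing proceeds to step S33. In step S33, the quantization circuit 63 sequentially reads the blocks of DCT coefficients obtained by the DCT circuit 62 and quantizes them according to the same quantization table as is used for JPEG encoding in the encoder 21, producing blocks of quantized DCT coefficients (DCT blocks).
【０１５０】
Meanwhile, the HD image serving as teacher data is also supplied to the normal equation adding circuit 67, which divides it into high-quality blocks of 16×16 pixels and, in step S34, takes as the high-quality block of interest one that has not yet been the high-quality block of interest. Also in step S34, the class tap extraction circuit 65 extracts from the DCT blocks obtained by the quantization circuit 63 the quantized DCT coefficients used to classify the high-quality block of interest, forms class taps from them, and supplies the class taps to the classification circuit 66. In step S35, the classification circuit 66 classifies the high-quality block of interest using the class taps from the class tap extraction circuit 65, as described with the flowchart of FIG. 11, supplies the resulting class code to the normal equation adding circuit 67, and the processing proceeds to step S36.
【０１５１】
In step S36, the normal equation adding circuit 67 takes as the pixel of interest a pixel of the high-quality block of interest that has not yet been the pixel of interest, in raster scan order, and the prediction tap extraction circuit 64 forms for that pixel of interest the same prediction taps as those formed by the prediction tap extraction circuit 41 of FIG. 6, by extracting the necessary quantized DCT coefficients from the output of the quantization circuit 63. The prediction tap extraction circuit 64 then supplies the prediction taps for the pixel of interest to the normal equation adding circuit 67 as student data, and the processing proceeds to step S37.
【０１５２】
In step S37, the normal equation adding circuit 67 performs the summation described above for the matrix A and the vector v of equation (8), on the pixel of interest serving as teacher data and the prediction taps (the quantized DCT coefficients forming them) serving as student data. This summation is performed separately for each class corresponding to the class code from the classification circuit 66 and for each pixel position mode of the pixel of interest.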
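The summation of step S37 can be sketched as an accumulator keyed by (class code, pixel position mode). The class name `NormalEquationAccumulator`, its dictionary layout, and the toy values are hypothetical; only the accumulated quantities (the components of A and v in equation (8)) come from the description above.

```python
from collections import defaultdict
from typing import Sequence


class NormalEquationAccumulator:
    """Accumulates A and v of equation (8) per (class code, pixel position mode)."""

    def __init__(self, num_taps: int) -> None:
        self.num_taps = num_taps
        # One J x J matrix A and one length-J vector v per (class, mode) key.
        self.a = defaultdict(lambda: [[0.0] * num_taps for _ in range(num_taps)])
        self.v = defaultdict(lambda: [0.0] * num_taps)

    def add(self, class_code: int, mode: int,
            taps: Sequence[float], teacher_pixel: float) -> None:
        """Add one (student data, teacher data) pair to its class and mode."""
        a = self.a[(class_code, mode)]
        v = self.v[(class_code, mode)]
        for n in range(self.num_taps):
            v[n] += taps[n] * teacher_pixel          # sum of x_in * y_i
            for m in range(self.num_taps):
                a[n][m] += taps[n] * taps[m]         # sum of x_in * x_im


if __name__ == "__main__":
    acc = NormalEquationAccumulator(num_taps=2)
    acc.add(class_code=3, mode=0, taps=[1.0, 2.0], teacher_pixel=5.0)
    acc.add(class_code=3, mode=0, taps=[0.0, 1.0], teacher_pixel=1.0)
    print(acc.a[(3, 0)], acc.v[(3, 0)])
```

Once all teacher pixels have been added, each key holds one normal equation AW = v that can be solved independently to give the tap coefficient set for that class and pixel position mode.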
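【０１５３】
The processing then proceeds to step S38, where the normal equation adding circuit 67 determines whether the summation has been performed with all pixels of the high-quality block of interest as the pixel of interest. If it is determined in step S38 that not all pixels have yet been summed, the processing returns to step S36, where the normal equation adding circuit 67 takes as the new pixel of interest a pixel of the high-quality block of interest that has not yet been the pixel of interest, in raster scan order, and the same processing is repeated.
【０１５４】
If it is determined in step S38 that the summation has been performed with all pixels of the high-quality block of interest as the pixel of interest, the processing proceeds to step S39, where the normal equation adding circuit 67 determines whether all high-quality blocks obtained from the image serving as teacher data have been processed as the high-quality block of interest. If it is determined in step S39 that not all high-quality blocks have yet been processed, the processing returns to step S34, a high-quality block that has not yet been the high-quality block of interest is newly taken as the high-quality block of interest, and the same processing is repeated.
【０１５５】
If, on the other hand, it is determined in step S39 that all high-quality blocks obtained from the image serving as teacher data have been processed as the high-quality block of interest, that is, if the normal equation adding circuit 67 has obtained a normal equation for each pixel position mode of each class, the processing proceeds to step S40. There the tap coefficient determining circuit 68 solves the normal equations generated for each pixel position mode of each class to obtain, for each class, the 256 sets of tap coefficients corresponding to the 256 pixel position modes of that class, supplies them to the address corresponding to each class in the coefficient table storage unit 69 for storage, and the processing ends.
【０１５６】
The tap coefficients for each class stored in the coefficient table storage unit 69 in this way are the ones stored in the coefficient table storage unit 44 of FIG. 6.
【０１５７】
The tap coefficients stored in the coefficient table storage unit 44 are thus obtained by learning such that the prediction error (here, the squared error) of the predicted values of the original pixel values obtained by the linear prediction operation is statistically minimized. As a result, the coefficient conversion circuit 32 of FIG. 6 can decode a JPEG-encoded image into a high-quality image that is as close as possible to the quality of the HD image used as teacher data.
【０１５８】
Furthermore, as described above, the coefficient conversion circuit 32 simultaneously performs decoding of the JPEG-encoded image and the improvement processing for raising its image quality, so a quality-enhanced decoded image can be obtained efficiently from a JPEG-encoded image.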
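【０１５９】
Next, FIG. 14 shows a second example configuration of the coefficient conversion circuit 32 of FIG. 3. In the figure, parts corresponding to those in FIG. 6 are given the same reference numerals, and their description is omitted below where appropriate. That is, the coefficient conversion circuit 32 of FIG. 14 is configured basically in the same way as that of FIG. 6 except that an inverse quantization circuit 71 is newly provided.
【０１６０】
In the embodiment of FIG. 14, the inverse quantization circuit 71 is supplied with the quantized DCT coefficients of each block obtained by entropy-decoding the coded data in the entropy decoding circuit 31 (FIG. 3).
【０１６１】
As described above, the entropy decoding circuit 31 obtains a quantization table from the coded data in addition to the quantized DCT coefficients. In the embodiment of FIG. 14, this quantization table is also supplied from the entropy decoding circuit 31 to the inverse quantization circuit 71.
【０１６２】
The inverse quantization circuit 71 inversely quantizes the quantized DCT coefficients from the entropy decoding circuit 31 according to the quantization table, also from the entropy decoding circuit 31, and supplies the resulting DCT coefficients to the prediction tap extraction circuit 41 and the class tap extraction circuit 42.
【０１６３】
The prediction tap extraction circuit 41 and the class tap extraction circuit 42 therefore form the prediction taps and class taps, respectively, from DCT coefficients rather than quantized DCT coefficients, and the subsequent processing is likewise performed on DCT coefficients in the same way as in FIG. 6.
【０１６４】
Since the embodiment of FIG. 14 thus operates on DCT coefficients rather than quantized DCT coefficients, the tap coefficients stored in the coefficient table storage unit 44 must differ from those in the case of FIG. 6.
【０１６５】
FIG. 15 therefore shows an example configuration of one embodiment of a learning device that performs the learning processing for the tap coefficients to be stored in the coefficient table storage unit 44 of FIG. 14. In the figure, parts corresponding to those in FIG. 12 are given the same reference numerals, and their description is omitted below where appropriate. That is, the learning device of FIG. 15 is configured basically in the same way as that of FIG. 12 except that an inverse quantization circuit 81 is newly provided after the quantization circuit 63.
【０１６６】
In the embodiment of FIG. 15, the inverse quantization circuit 81 inversely quantizes the quantized DCT coefficients output by the quantization circuit 63 in the same way as the inverse quantization circuit 71 of FIG. 14, and supplies the resulting DCT coefficients to the prediction tap extraction circuit 64 and the class tap extraction circuit 65.
【０１６７】
The prediction tap extraction circuit 64 and the class tap extraction circuit 65 therefore form the prediction taps and class taps, respectively, from DCT coefficients rather than quantized DCT coefficients, and the subsequent processing is likewise performed on DCT coefficients in the same way as in FIG. 12.
【０１６８】
As a result, tap coefficients are obtained that reduce the influence of the quantization error arising when the DCT coefficients are quantized and then inversely quantized.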
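【０１６９】
Next, FIG. 16 shows a third example configuration of the coefficient conversion circuit 32 of FIG. 3. In the figure, parts corresponding to those in FIG. 6 are given the same reference numerals, and their description is omitted below where appropriate. That is, the coefficient conversion circuit 32 of FIG. 16 is configured basically in the same way as that of FIG. 6 except that the class tap extraction circuit 42 and the classification circuit 43 are not provided.
【０１７０】
The embodiment of FIG. 16 therefore has no concept of classes, but this can also be regarded as there being a single class, so the coefficient table storage unit 44 stores the tap coefficients of only one class, and the processing is performed using them.
【０１７１】
Thus, in the embodiment of FIG. 16, the tap coefficients stored in the coefficient table storage unit 44 differ from those in the case of FIG. 6.
【０１７２】
FIG. 17 therefore shows an example configuration of one embodiment of a learning device that performs the learning processing for the tap coefficients to be stored in the coefficient table storage unit 44 of FIG. 16. In the figure, parts corresponding to those in FIG. 12 are given the same reference numerals, and their description is omitted below where appropriate. That is, the learning device of FIG. 17 is configured basically in the same way as that of FIG. 12 except that the class tap extraction circuit 65 and the classification circuit 66 are not provided.
【０１７３】
In the learning device of FIG. 17, therefore, the normal equation adding circuit 67 performs the summation described above for each pixel position mode, irrespective of class. The tap coefficient determining circuit 68 then obtains the tap coefficients by solving the normal equations generated for each pixel position mode.
【０１７４】
Next, FIG. 18 shows a fourth example configuration of the coefficient conversion circuit 32 of FIG. 3. In the figure, parts corresponding to those in FIG. 6 or FIG. 14 are given the same reference numerals, and their description is omitted below where appropriate. That is, the coefficient conversion circuit 32 of FIG. 18 is configured basically in the same way as that of FIG. 6 except that the class tap extraction circuit 42 and the classification circuit 43 are not provided and the inverse quantization circuit 71 is newly provided.
【０１７５】
In the embodiment of FIG. 18, therefore, as in the embodiment of FIG. 16, the coefficient table storage unit 44 stores the tap coefficients of only one class, and the processing is performed using them.
【０１７６】
Further, in the embodiment of FIG. 18, as in the embodiment of FIG. 14, the prediction tap extraction circuit 41 forms the prediction taps from the DCT coefficients output by the inverse quantization circuit 71 rather than from quantized DCT coefficients, and the subsequent processing is likewise performed on DCT coefficients.
【０１７７】
In the embodiment of FIG. 18 too, therefore, the tap coefficients stored in the coefficient table storage unit 44 differ from those in the case of FIG. 6.
【０１７８】
FIG. 19 therefore shows an example configuration of one embodiment of a learning device that performs the learning processing for the tap coefficients to be stored in the coefficient table storage unit 44 of FIG. 18. In the figure, parts corresponding to those in FIG. 12 or FIG. 15 are given the same reference numerals, and their description is omitted below where appropriate. That is, the learning device of FIG. 19 is configured basically in the same way as that of FIG. 12 except that the class tap extraction circuit 65 and the classification circuit 66 are not provided and the inverse quantization circuit 81 is newly provided.
【０１７９】
In the learning device of FIG. 19, therefore, the prediction tap extraction circuit 64 forms the prediction taps from DCT coefficients rather than quantized DCT coefficients, and the subsequent processing is likewise performed on DCT coefficients. Further, the normal equation adding circuit 67 performs the summation described above irrespective of class, and the tap coefficient determining circuit 68 obtains the tap coefficients by solving the normal equations generated irrespective of class.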
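【０１８０】
The description so far has dealt with images encoded by JPEG, which compression-codes still images, but the present invention can also be applied to images encoded by, for example, MPEG, which compression-codes moving images.
【０１８１】
That is, FIG. 20 shows an example configuration of the encoder 21 of FIG. 2 for the case where MPEG encoding is performed.
【０１８２】
The frames (or fields) making up the moving image to be MPEG-encoded are sequentially supplied to a motion detection circuit 91 and an arithmetic unit 92.
【０１８３】
The motion detection circuit 91 detects motion vectors for the frames supplied to it in units of macroblocks of 16×16 pixels, and supplies them to an entropy coding circuit 96 and a motion compensation circuit 100.
【０１８４】
If the image supplied to it is an I (Intra) picture, the arithmetic unit 92 supplies it as-is to a blocking circuit 93. If it is a P (Predictive) or B (Bidirectionally predictive) picture, the arithmetic unit 92 calculates the difference from the reference image supplied from the motion compensation circuit 100 and supplies the difference values to the blocking circuit 93.
【０１８５】
The blocking circuit 93 divides the output of the arithmetic unit 92 into pixel blocks of 8×8 pixels and supplies them to a DCT circuit 94. The DCT circuit 94 DCT-processes the pixel blocks from the blocking circuit 93 and supplies the resulting DCT coefficients to a quantization circuit 95. The quantization circuit 95 quantizes the block-by-block DCT coefficients from the DCT circuit 94 with predetermined quantization steps and supplies the resulting quantized DCT coefficients to the entropy coding circuit 96. The entropy coding circuit 96 entropy-codes the quantized DCT coefficients from the quantization circuit 95, adds the motion vectors from the motion detection circuit 91 and other necessary information, and outputs the resulting coded data (for example, an MPEG transport stream) as the MPEG encoding result.
【０１８６】
Of the quantized DCT coefficients output by the quantization circuit 95, those of I pictures and P pictures must be locally decoded for use as reference images for P pictures and B pictures encoded later, so they are supplied to an inverse quantization circuit 97 as well as to the entropy coding circuit 96.
【０１８７】
The inverse quantization circuit 97 inversely quantizes the quantized DCT coefficients from the quantization circuit 95 into DCT coefficients and supplies them to an inverse DCT circuit 98. The inverse DCT circuit 98 applies inverse DCT processing to the DCT coefficients from the inverse quantization circuit 97 and outputs the result to an arithmetic unit 99. In addition to the output of the inverse DCT circuit 98, the arithmetic unit 99 is supplied with the reference image output by the motion compensation circuit 100. When the output of the inverse DCT circuit 98 is that of a P picture, the arithmetic unit 99 adds it to the output of the motion compensation circuit 100 to decode the original image and supplies the result to the motion compensation circuit 100. When the output of the inverse DCT circuit 98 is that of an I picture, that output is already the decoded image of the I picture, so the arithmetic unit 99 supplies it as-is to the motion compensation circuit 100.
【０１８８】
The motion compensation circuit 100 applies motion compensation according to the motion vectors from the motion detection circuit 91 to the locally decoded image supplied from the arithmetic unit 99, and supplies the motion-compensated image as a reference image to the arithmetic units 92 and 99.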
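【０１８９】
FIG. 21 shows an example configuration of a conventional MPEG decoder that decodes the coded data obtained by the MPEG encoding described above.
【０１９０】
The coded data is supplied to an entropy decoding circuit 111, which entropy-decodes it to obtain quantized DCT coefficients, motion vectors, and other information. The quantized DCT coefficients are supplied to an inverse quantization circuit 112, and the motion vectors are supplied to a motion compensation circuit 116.
【０１９１】
The inverse quantization circuit 112 inversely quantizes the quantized DCT coefficients from the entropy decoding circuit 111 into DCT coefficients and supplies them to an inverse DCT circuit 113. The inverse DCT circuit 113 applies inverse DCT processing to the DCT coefficients from the inverse quantization circuit 112 and outputs the result to an arithmetic unit 114. In addition to the output of the inverse DCT circuit 113, the arithmetic unit 114 is supplied, as a reference image, with the already decoded I picture or P picture output by the motion compensation circuit 116 after motion compensation according to the motion vectors from the entropy decoding circuit 111. When the output of the inverse DCT circuit 113 is that of a P or B picture, the arithmetic unit 114 adds it to the output of the motion compensation circuit 116 to decode the original image and supplies the result to a block decomposition circuit 115. When the output of the inverse DCT circuit 113 is that of an I picture, that output is already the decoded image of the I picture, so the arithmetic unit 114 supplies it as-is to the block decomposition circuit 115.
【０１９２】
The block decomposition circuit 115 unblocks the decoded image supplied from the arithmetic unit 114 in units of pixel blocks to obtain and output a decoded image.
【０１９３】
Meanwhile, the motion compensation circuit 116 receives the I pictures and P pictures among the decoded images output by the arithmetic unit 114 and applies motion compensation to them according to the motion vectors from the entropy decoding circuit 111. The motion compensation circuit 116 then supplies the motion-compensated image to the arithmetic unit 114 as a reference image.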
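【０１９４】
The decoder 22 of FIG. 3 can also decode MPEG-encoded coded data efficiently into a high-quality image in the manner described above.
【０１９５】
That is, the coded data is supplied to the entropy decoding circuit 31, which entropy-decodes it. The quantized DCT coefficients, motion vectors, and other information obtained by this entropy decoding are supplied from the entropy decoding circuit 31 to the coefficient conversion circuit 32.
【０１９６】
The coefficient conversion circuit 32 performs a predetermined prediction operation using the quantized DCT coefficients Q from the entropy decoding circuit 31 and tap coefficients obtained by learning, and performs motion compensation as needed according to the motion vectors from the entropy decoding circuit 31, thereby decoding the quantized DCT coefficients into high-quality pixel values, and supplies the high-quality blocks formed by those pixel values to the block decomposition circuit 33.
【０１９７】
The block decomposition circuit 33 unblocks the high-quality blocks obtained by the coefficient conversion circuit 32 to obtain and output a high-quality decoded image whose horizontal and vertical pixel counts are both, for example, twice those of the MPEG-encoded image.
【０１９８】
Next, FIG. 22 shows an example configuration of the coefficient conversion circuit 32 of FIG. 3 for the case where the decoder 22 decodes MPEG-encoded coded data. In the figure, parts corresponding to those in FIG. 18 or FIG. 21 are given the same reference numerals, and their description is omitted below where appropriate. That is, the coefficient conversion circuit 32 of FIG. 22 is configured basically in the same way as that of FIG. 18 except that the arithmetic unit 114 and the motion compensation circuit 116 of FIG. 21 are provided after the product-sum operation circuit 45.
【０１９９】
In the coefficient conversion circuit 32 of FIG. 22, therefore, the quantized DCT coefficients are inversely quantized in the inverse quantization circuit 71, and the prediction tap extraction circuit 41 forms prediction taps from the resulting DCT coefficients. The product-sum operation circuit 45 then performs the prediction operation using those prediction taps and the tap coefficients stored in the coefficient table storage unit 44, and outputs high-quality data whose horizontal and vertical pixel counts are both twice those of the original image.
【０２００】
The arithmetic unit 114 adds the output of the product-sum operation circuit 45 to the output of the motion compensation circuit 116 as needed, thereby decoding a high-quality image whose horizontal and vertical pixel counts are both twice those of the original image, and outputs it to the block decomposition circuit 33 (FIG. 3).
【０２０１】
That is, for an I picture, the output of the product-sum operation circuit 45 is already a high-quality image whose horizontal and vertical pixel counts are both twice those of the original image, so the arithmetic unit 114 outputs the output of the product-sum operation circuit 45 as-is to the block decomposition circuit 33.
【０２０２】
For a P or B picture, the output of the product-sum operation circuit 45 is the difference between a high-quality image whose horizontal and vertical pixel counts are both twice those of the original image and a high-quality reference image. The arithmetic unit 114 therefore adds the output of the product-sum operation circuit 45 to the high-quality reference image supplied from the motion compensation circuit 116, thereby decoding a high-quality image whose horizontal and vertical pixel counts are both twice those of the original image, and outputs it to the block decomposition circuit 33.
【０２０３】
Meanwhile, the motion compensation circuit 116 receives the I and P pictures among the high-quality decoded images output by the arithmetic unit 114, applies motion compensation using the motion vectors from the entropy decoding circuit 31 (FIG. 3) to those high-quality decoded I or P pictures to obtain a high-quality reference image, and supplies it to the arithmetic unit 114.
【０２０４】
Here, since the horizontal and vertical pixel counts of the decoded image are both twice those of the original image, the motion compensation circuit 116 performs motion compensation according to, for example, a motion vector obtained by doubling both the horizontal and vertical magnitudes of the motion vector from the entropy decoding circuit 31.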
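The motion vector adjustment can be sketched as a simple scaling. The representation of a motion vector as an integer `(dx, dy)` tuple and the function name `scale_motion_vector` are assumptions made for this illustration; sub-pixel motion vector precision is not considered here.

```python
from typing import Tuple


def scale_motion_vector(mv: Tuple[int, int], factor: int = 2) -> Tuple[int, int]:
    """Scale a motion vector estimated on the original image so that it can be
    applied to a reference image whose width and height are `factor` times larger."""
    dx, dy = mv
    return (dx * factor, dy * factor)


if __name__ == "__main__":
    print(scale_motion_vector((3, -2)))
```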
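[0205]
Next, FIG. 23 shows a configuration example of an embodiment of a learning device that learns the tap coefficients to be stored in the coefficient table storage unit 44 of FIG. 22. In the figure, parts corresponding to those in FIG. 19 are denoted by the same reference numerals, and their description is omitted below as appropriate.
[0206]
An HD image for learning is input to the thinning circuit 120 as teacher data. The thinning circuit 120 thins out the pixels of the HD image serving as teacher data in the same manner as, for example, the thinning circuit 60 of FIG. 12, and generates semi-teacher data, which is an SD image whose numbers of horizontal and vertical pixels are both halved. The SD image serving as semi-teacher data is supplied to the motion vector detection circuit 121 and the arithmetic unit 122.
[0207]
The motion vector detection circuit 121, arithmetic unit 122, blocking circuit 123, DCT circuit 124, quantization circuit 125, inverse quantization circuit 127, inverse DCT circuit 128, arithmetic unit 129, and motion compensation circuit 130 perform the same processing as the motion vector detection circuit 91, arithmetic unit 92, blocking circuit 93, DCT circuit 94, quantization circuit 95, inverse quantization circuit 97, inverse DCT circuit 98, arithmetic unit 99, and motion compensation circuit 100 of FIG. 20, respectively. As a result, the quantization circuit 125 outputs quantized DCT coefficients similar to those output by the quantization circuit 95 of FIG. 20.
[0208]
The quantized DCT coefficients output by the quantization circuit 125 are supplied to the inverse quantization circuit 81, which inversely quantizes them into DCT coefficients and supplies these to the prediction tap extraction circuit 64. The prediction tap extraction circuit 64 forms prediction taps from the DCT coefficients from the inverse quantization circuit 81 and supplies them, as student data, to the normal equation addition circuit 67.
[0209]
Meanwhile, the HD image serving as teacher data is supplied to the arithmetic unit 132 as well as to the thinning circuit 120. The arithmetic unit 132 subtracts the output of the interpolation circuit 131, as necessary, from the HD image serving as teacher data and supplies the result to the normal equation addition circuit 67.
[0210]
That is, the interpolation circuit 131 generates a high-quality reference image in which the numbers of horizontal and vertical pixels of the SD reference image output by the motion compensation circuit 130 are doubled, and supplies it to the arithmetic unit 132.
[0211]
When the HD image supplied to it is an I picture, the arithmetic unit 132 supplies that I-picture HD image as is, as teacher data, to the normal equation addition circuit 67. When the HD image supplied to it is a P or B picture, the arithmetic unit 132 computes the difference between that P- or B-picture HD image and the high-quality reference image output by the interpolation circuit 131, thereby obtaining a high-quality version of the difference for the SD image (semi-teacher data) output by the arithmetic unit 122, and outputs it, as teacher data, to the normal equation addition circuit 67.
[0212]
The interpolation circuit 131 can increase the number of pixels by, for example, simple interpolation. It can also increase the number of pixels by, for example, class classification adaptive processing. Furthermore, the arithmetic unit 132 can use, as the reference image, an image obtained by MPEG-encoding the HD image serving as teacher data, locally decoding it, and applying motion compensation.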
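How the teacher data differs between I pictures and P or B pictures can be sketched as follows. This is a minimal Python sketch under stated assumptions: pixel replication stands in for the "simple interpolation" that the interpolation circuit 131 may use, and the names `upsample_2x` and `make_teacher_data` are hypothetical.

```python
def upsample_2x(image):
    """Double width and height by pixel replication (one simple interpolation)."""
    out = []
    for row in image:
        wide = [value for value in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def make_teacher_data(hd_image, picture_type, sd_reference=None):
    """Return what would be fed to the normal equation addition as teacher data."""
    if picture_type == "I":
        # I picture: the HD image itself is the teacher data.
        return [row[:] for row in hd_image]
    # P/B picture: HD image minus the enlarged (high-quality) reference image.
    hd_reference = upsample_2x(sd_reference)
    return [[h - r for h, r in zip(h_row, r_row)]
            for h_row, r_row in zip(hd_image, hd_reference)]
```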
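[0213]
The normal equation addition circuit 67 takes the output of the arithmetic unit 132 as teacher data and the prediction taps from the inverse quantization circuit 81 as student data, performs the summation described above, and thereby generates the normal equations.
[0214]
The tap coefficient determination circuit 68 then obtains the tap coefficients by solving the normal equations generated by the normal equation addition circuit 67, and supplies them to the coefficient table storage unit 69 for storage.
[0215]
Because the product-sum operation circuit 45 of FIG. 22 decodes MPEG-encoded data using the tap coefficients obtained in this way, the decoding of the MPEG-encoded image and the processing for improving its image quality can again be performed simultaneously. Consequently, a high-quality decoded image, which in this embodiment is an HD image whose numbers of horizontal and vertical pixels are both doubled, can be obtained efficiently from the MPEG-encoded image.
[0216]
The coefficient conversion circuit 32 of FIG. 22 can be configured without the inverse quantization circuit 71. In that case, the learning device of FIG. 23 may be configured without the inverse quantization circuit 81.
[0217]
The coefficient conversion circuit 32 of FIG. 22 can also be configured with the class tap extraction circuit 42 and the class classification circuit 43, as in the case of FIG. 6. In that case, the learning device of FIG. 23 may be configured with the class tap extraction circuit 65 and the class classification circuit 66, as in the case of FIG. 12.
[0218]
Furthermore, in the case described above, the decoder 22 (FIG. 3) obtains a decoded image in which the spatial resolution of the original image is doubled. However, the decoder 22 can also obtain a decoded image in which the spatial resolution of the original image is multiplied by an arbitrary factor, or a decoded image in which the temporal resolution of the original image is improved.
[0219]
That is, for example, when the image to be MPEG-encoded has a low temporal resolution as shown in FIG. 24A, the decoder 22 can decode the encoded data obtained by MPEG-encoding that image into an image whose temporal resolution is twice that of the original image, as shown in FIG. 24B. Furthermore, for example, when the image to be MPEG-encoded is a 24 frames/second image used in motion pictures, as shown in FIG. 25A, the decoder 22 can decode the encoded data obtained by MPEG-encoding that image into a 60 frames/second image whose temporal resolution is 60/24 times that of the original image, as shown in FIG. 25B. In this case, so-called 2-3 pulldown can be performed easily.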
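For reference, the 60/24 rate relationship behind 2-3 pulldown can be illustrated with the conventional repetition pattern. The Python sketch below is only that conventional pattern, shown to make the arithmetic concrete; it is not the prediction-based decoding described in this specification, and the function name `pulldown_2_3` is hypothetical.

```python
def pulldown_2_3(frames):
    """Repeat source frames alternately 2 and 3 times (24 in, 60 out per second)."""
    out = []
    for index, frame in enumerate(frames):
        repeats = 2 if index % 2 == 0 else 3
        out.extend([frame] * repeats)
    return out
```

Every pair of input frames yields five output pictures, which gives the 60/24 = 5/2 ratio mentioned in the text.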
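[0220]
Here, when the decoder 22 improves the temporal resolution as described above, the prediction taps and class taps can be formed from the DCT coefficients of two or more frames, as shown in FIG. 26, for example.
[0221]
The decoder 22 can also obtain a decoded image in which not just one of the spatial resolution and the temporal resolution but both are improved.
[0222]
Next, the series of processes described above can be performed by hardware or by software. When the series of processes is performed by software, a program constituting the software is installed in a general-purpose computer or the like.
[0223]
FIG. 27 shows a configuration example of an embodiment of a computer in which a program that executes the series of processes described above is installed.
[0224]
The program can be recorded in advance on a hard disk 205 or a ROM 203 serving as a recording medium built into the computer.
[0225]
Alternatively, the program can be stored (recorded) temporarily or permanently on a removable recording medium 211 such as a floppy disk, a CD-ROM (Compact Disc Read Only Memory), an MO (Magneto optical) disk, a DVD (Digital Versatile Disc), a magnetic disk, or a semiconductor memory. Such a removable recording medium 211 can be provided as so-called packaged software.
[0226]
In addition to being installed in the computer from the removable recording medium 211 described above, the program can be transferred to the computer wirelessly from a download site via an artificial satellite for digital satellite broadcasting, or transferred to the computer by wire via a network such as a LAN (Local Area Network) or the Internet. The computer can receive the program transferred in this way at the communication unit 208 and install it on the built-in hard disk 205.
[0227]
The computer incorporates a CPU (Central Processing Unit) 202. An input/output interface 210 is connected to the CPU 202 via a bus 201. When a command is input via the input/output interface 210 by the user operating an input unit 207 composed of a keyboard, a mouse, a microphone, and the like, the CPU 202 executes a program stored in the ROM (Read Only Memory) 203 accordingly. Alternatively, the CPU 202 loads into a RAM (Random Access Memory) 204 and executes a program stored on the hard disk 205, a program transferred from a satellite or a network, received by the communication unit 208, and installed on the hard disk 205, or a program read from the removable recording medium 211 mounted in a drive 209 and installed on the hard disk 205. The CPU 202 thereby performs the processing according to the flowcharts described above or the processing performed by the configurations of the block diagrams described above. Then, as necessary, the CPU 202, for example, outputs the processing result from an output unit 206 composed of an LCD (Liquid Crystal Display), a speaker, and the like via the input/output interface 210, transmits it from the communication unit 208, or records it on the hard disk 205.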
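[0228]
In this specification, the processing steps that describe a program for causing a computer to perform various kinds of processing do not necessarily have to be processed in time series in the order described in the flowcharts, and also include processing executed in parallel or individually (for example, parallel processing or object-based processing).
[0229]
The program may be processed by a single computer or may be processed in a distributed manner by a plurality of computers. Furthermore, the program may be transferred to a remote computer and executed there.
[0230]
Although this embodiment deals with image data, the present invention is also applicable to, for example, audio data.
[0231]
Also, although this embodiment decodes encoded data produced by JPEG or MPEG encoding, which performs at least DCT processing, the present invention is applicable to decoding data transformed by other orthogonal transforms or frequency transforms. That is, the present invention is also applicable to decoding, for example, subband-coded data or Fourier-transformed data.
[0232]
Furthermore, although in this embodiment the decoder 22 stores in advance the tap coefficients used for decoding, the tap coefficients can instead be included in the encoded data and provided to the decoder 22.
[0233]
Also, although in this embodiment decoding is performed by a linear first-order prediction calculation using tap coefficients, decoding can also be performed by a second- or higher-order prediction calculation.
[0234]
[Effects of the Invention]
According to the first data processing apparatus, data processing method, and recording medium of the present invention, tap coefficients obtained by learning are acquired, and a predetermined prediction calculation is performed using the tap coefficients and the converted data, whereby processed data is obtained in which the converted data is decoded into the original data and the original data is subjected to predetermined processing. Accordingly, the converted data can be efficiently decoded and the decoded data subjected to the predetermined processing.
[0235]
According to the second data processing apparatus, data processing method, and recording medium of the present invention, processing based on predetermined processing is applied to teacher data serving as a teacher, and the resulting semi-teacher data is at least orthogonally transformed or frequency transformed to generate student data serving as a student. Learning is then performed so that the prediction error of the predicted value of the teacher data, obtained by performing a prediction calculation using tap coefficients and the student data, is statistically minimized, and the tap coefficients are thereby obtained. Accordingly, by using those tap coefficients, data that has been orthogonally transformed or frequency transformed can be efficiently decoded and the decoded data subjected to the predetermined processing.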
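[Brief Description of the Drawings]
[FIG. 1] A diagram for explaining conventional JPEG encoding/decoding.
[FIG. 2] A diagram showing a configuration example of an embodiment of an image transmission system to which the present invention is applied.
[FIG. 3] A block diagram showing a configuration example of the decoder 22 of FIG. 2.
[FIG. 4] A diagram showing how 8 × 8 DCT coefficients are decoded into 16 × 16 pixels.
[FIG. 5] A flowchart explaining the processing of the decoder 22 of FIG. 3.
[FIG. 6] A block diagram showing a first configuration example of the coefficient conversion circuit 32 of FIG. 3.
[FIG. 7] A diagram explaining examples of prediction taps and class taps.
[FIG. 8] A block diagram showing a configuration example of the class classification circuit 43 of FIG. 6.
[FIG. 9] A diagram for explaining the processing of the power calculation circuit 51 of FIG. 6.
[FIG. 10] A flowchart explaining the processing of the coefficient conversion circuit 32 of FIG. 6.
[FIG. 11] A flowchart explaining the processing of step S12 of FIG. 10 in more detail.
[FIG. 12] A block diagram showing a configuration example of a first embodiment of a learning device to which the present invention is applied.
[FIG. 13] A flowchart explaining the processing of the learning device of FIG. 12.
[FIG. 14] A block diagram showing a second configuration example of the coefficient conversion circuit 32 of FIG. 3.
[FIG. 15] A block diagram showing a configuration example of a second embodiment of a learning device to which the present invention is applied.
[FIG. 16] A block diagram showing a third configuration example of the coefficient conversion circuit 32 of FIG. 3.
[FIG. 17] A block diagram showing a configuration example of a third embodiment of a learning device to which the present invention is applied.
[FIG. 18] A block diagram showing a fourth configuration example of the coefficient conversion circuit 32 of FIG. 3.
[FIG. 19] A block diagram showing a configuration example of a fourth embodiment of a learning device to which the present invention is applied.
[FIG. 20] A block diagram showing a configuration example of the encoder 21 of FIG. 2.
[FIG. 21] A block diagram showing a configuration of an example of an MPEG decoder.
[FIG. 22] A block diagram showing a fifth configuration example of the coefficient conversion circuit 32 of FIG. 3.
[FIG. 23] A block diagram showing a configuration example of a fifth embodiment of a learning device to which the present invention is applied.
[FIG. 24] A diagram showing an image with improved temporal resolution.
[FIG. 25] A diagram showing an image with improved temporal resolution.
[FIG. 26] A diagram showing class taps and prediction taps formed from the DCT coefficients of two or more frames.
[FIG. 27] A block diagram showing a configuration example of an embodiment of a computer to which the present invention is applied.
[Explanation of Reference Numerals]
21 encoder, 22 decoder, 23 recording medium, 24 transmission medium, 31 entropy decoding circuit, 32 coefficient conversion circuit, 33 block decomposition circuit, 41 prediction tap extraction circuit, 42 class tap extraction circuit, 43 class classification circuit, 44 coefficient table storage unit, 45 product-sum operation circuit, 51 power calculation circuit, 52 class code generation circuit, 53 threshold table storage unit, 60 thinning circuit, 61 blocking circuit, 62 DCT circuit, 63 quantization circuit, 64 prediction tap extraction circuit, 65 class tap extraction circuit, 66 class classification circuit, 67 normal equation addition circuit, 68 tap coefficient determination circuit, 69 coefficient table storage unit, 71, 81 inverse quantization circuit, 114 arithmetic unit, 115 motion compensation circuit, 120 thinning circuit, 121 motion vector detection circuit, 122 arithmetic unit, 123 blocking circuit, 124 DCT circuit, 125 quantization circuit, 127 inverse quantization circuit, 128 inverse DCT circuit, 129 arithmetic unit, 130 motion compensation circuit, 131 interpolation circuit, 132 arithmetic unit, 201 bus, 202 CPU, 203 ROM, 204 RAM, 205 hard disk, 206 output unit, 207 input unit, 208 communication unit, 209 drive, 210 input/output interface, 211 removable recording medium
[0001]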
BACKGROUND OF THE INVENTION
The present invention relates to a data processing apparatus, a data processing method, and a recording medium, and in particular to a data processing apparatus, a data processing method, and a recording medium suitable for use in, for example, decoding a compressed image into a high-quality image.
[0002]
[Prior art]
For example, since digital image data has a large amount of data, a large-capacity recording medium or transmission medium is required to perform recording or transmission as it is. In general, therefore, image data is compressed and encoded to reduce the amount of data before recording or transmission.
[0003]
Methods for compressing and encoding images include, for example, the JPEG (Joint Photographic Experts Group) method, which is a compression encoding method for still images, and the MPEG (Moving Picture Experts Group) method, which is a compression encoding method for moving images.
[0004]
For example, encoding and decoding of image data by the JPEG method are performed as shown in FIG. 1.
[0005]
That is, FIG. 1A shows a configuration of an example of a conventional JPEG encoding apparatus.
[0006]
The image data to be encoded is input to the blocking circuit 1, which divides the input image data into blocks of 64 pixels (8 × 8 pixels). Each block obtained by the blocking circuit 1 is supplied to a DCT (Discrete Cosine Transform) circuit 2. The DCT circuit 2 applies DCT processing to each block from the blocking circuit 1 and transforms it into a total of 64 DCT coefficients: one DC (Direct Current) component and 63 frequency components (AC (Alternating Current) components) in the horizontal and vertical directions. The 64 DCT coefficients of each block are supplied from the DCT circuit 2 to the quantization circuit 3.
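The blocking and DCT steps can be sketched as follows. This is a minimal pure-Python sketch under stated assumptions: the image is a list of pixel rows whose sides are multiples of 8, the transform is the orthonormal 2-D DCT-II, and the function names `split_into_blocks` and `dct_2d` are hypothetical. The level shift that a real JPEG encoder applies before the DCT is omitted.

```python
import math

N = 8  # block size


def split_into_blocks(image):
    """Split an image (list of rows) into 8x8 blocks in raster order."""
    blocks = []
    for top in range(0, len(image), N):
        for left in range(0, len(image[0]), N):
            blocks.append([row[left:left + N] for row in image[top:top + N]])
    return blocks


def dct_2d(block):
    """Orthonormal 2-D DCT-II of one 8x8 block; result[0][0] is the DC component."""
    def scale(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    coeffs = [[0.0] * N for _ in range(N)]
    for u in range(N):          # vertical frequency index
        for v in range(N):      # horizontal frequency index
            total = 0.0
            for y in range(N):
                for x in range(N):
                    total += (block[y][x]
                              * math.cos((2 * y + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * x + 1) * v * math.pi / (2 * N)))
            coeffs[u][v] = scale(u) * scale(v) * total
    return coeffs
```

For a flat block, all of the energy lands in the DC component and the 63 AC components are zero up to rounding.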
[0007]
The quantization circuit 3 quantizes the DCT coefficients from the DCT circuit 2 in accordance with a predetermined quantization table and supplies the quantization result (hereinafter referred to as quantized DCT coefficients, as appropriate), together with the quantization table used for the quantization, to the entropy encoding circuit 4.
[0008]
Here, FIG. 1B shows an example of the quantization table used in the quantization circuit 3. In general, the quantization table takes human visual characteristics into account and sets quantization steps such that the highly important low-frequency DCT coefficients are quantized finely and the less important high-frequency DCT coefficients are quantized coarsely. This allows efficient compression while suppressing degradation of image quality.
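Quantization with a per-coefficient step, and the inverse quantization used on the decoding side, can be sketched as follows. This is a minimal Python sketch under stated assumptions: the names `quantize` and `dequantize` are hypothetical, the step values in the usage note are arbitrary illustration values and are not taken from FIG. 1B, and Python's built-in `round` stands in for whatever rounding rule an actual encoder uses.

```python
def quantize(coeffs, table):
    """Divide each DCT coefficient by its quantization step and round."""
    return [[int(round(c / step)) for c, step in zip(c_row, t_row)]
            for c_row, t_row in zip(coeffs, table)]


def dequantize(levels, table):
    """Multiply each quantized level by its quantization step."""
    return [[q * step for q, step in zip(q_row, t_row)]
            for q_row, t_row in zip(levels, table)]
```

With steps of 16 and 11, the coefficients 100.0 and 7.0 come back as 96 and 11; a larger step leaves fewer distinct levels to encode but moves the reconstructed value further from the original.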
[0009]
The entropy encoding circuit 4 applies entropy encoding, for example Huffman encoding, to the quantized DCT coefficients from the quantization circuit 3, appends the quantization table from the quantization circuit 3, and outputs the resulting encoded data as the JPEG encoding result.
[0010]
Next, FIG. 1C shows a configuration of an example of a conventional JPEG decoding apparatus that decodes the encoded data output by the JPEG encoding apparatus of FIG. 1A.
[0011]
The encoded data is input to the entropy decoding circuit 11, which separates the encoded data into the entropy-encoded quantized DCT coefficients and the quantization table. The entropy decoding circuit 11 then entropy-decodes the entropy-encoded quantized DCT coefficients and supplies the resulting quantized DCT coefficients, together with the quantization table, to the inverse quantization circuit 12. The inverse quantization circuit 12 inversely quantizes the quantized DCT coefficients from the entropy decoding circuit 11 in accordance with the quantization table, also from the entropy decoding circuit 11, and supplies the resulting DCT coefficients to the inverse DCT circuit 13. The inverse DCT circuit 13 applies inverse DCT processing to the DCT coefficients from the inverse quantization circuit 12 and supplies the resulting decoded block of 8 × 8 pixels to the block decomposition circuit 14. The block decomposition circuit 14 unblocks the decoded blocks from the inverse DCT circuit 13 to obtain and output a decoded image.
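The inverse quantization and inverse DCT stages of this conventional decoder can be sketched as follows. This is a minimal pure-Python sketch under stated assumptions: entropy decoding is taken as already done, the inverse transform is the orthonormal 2-D inverse DCT, and the names `idct_2d` and `decode_block` are hypothetical.

```python
import math

N = 8  # block size


def idct_2d(coeffs):
    """Orthonormal 2-D inverse DCT of one 8x8 coefficient block."""
    def scale(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    block = [[0.0] * N for _ in range(N)]
    for y in range(N):
        for x in range(N):
            total = 0.0
            for u in range(N):
                for v in range(N):
                    total += (scale(u) * scale(v) * coeffs[u][v]
                              * math.cos((2 * y + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * x + 1) * v * math.pi / (2 * N)))
            block[y][x] = total
    return block


def decode_block(levels, table):
    """Inverse quantization followed by inverse DCT for one block."""
    coeffs = [[q * step for q, step in zip(q_row, t_row)]
              for q_row, t_row in zip(levels, table)]
    return idct_2d(coeffs)
```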
[0012]
[Problems to be solved by the invention]
In the JPEG encoding apparatus of FIG. 1A, the amount of encoded data can be reduced by increasing the quantization steps of the quantization table used to quantize blocks in the quantization circuit 3. That is, high compression can be realized.
[0013]
However, if the quantization steps are increased, the so-called quantization error also increases, so the image quality of the decoded image obtained by the JPEG decoding apparatus of FIG. 1C deteriorates. That is, blur, block distortion, mosquito noise, and the like appear conspicuously in the decoded image.
[0014]
Therefore, to keep the image quality of the decoded image from deteriorating while reducing the amount of encoded data, or to improve the image quality of the decoded image while maintaining the amount of encoded data, some processing for improving image quality must be performed after JPEG decoding.
[0015]
However, performing the processing for improving the image quality after JPEG decoding makes the processing complicated, and the time until a decoded image is finally obtained also becomes long.
[0016]
The present invention has been made in view of such a situation, and makes it possible to efficiently obtain a decoded image with good image quality from a JPEG-encoded image or the like.
[0017]
[Means for Solving the Problems]
The first data processing apparatus according to the present invention comprises: acquisition means for acquiring tap coefficients obtained by performing learning; and calculation means for performing a predetermined prediction calculation using the tap coefficients and the converted data, thereby obtaining processed data in which the converted data is decoded into the original data and the original data is subjected to predetermined processing.
[0018]
In the first data processing device, the calculation means can perform linear primary prediction calculation using the tap coefficient and the conversion data.
[0019]
The first data processing apparatus can further be provided with storage means for storing tap coefficients. In this case, the acquisition means can acquire the tap coefficients from the storage means.
[0020]
In the first data processing apparatus, the transformed data can be obtained by subjecting the original data to orthogonal transformation or frequency transformation and further quantization.
[0021]
The first data processing apparatus can further include inverse quantization means for inversely quantizing the converted data. In this case, the calculation means can perform the prediction calculation using the inversely quantized converted data.
[0022]
In the first data processing apparatus, the converted data may be obtained by performing at least discrete cosine transform on the original data.
[0023]
The first data processing apparatus can further include prediction tap extraction means for extracting the converted data that is used, together with the tap coefficients, to predict the data of interest among the processed data, and outputting it as a prediction tap. In this case, the calculation means can perform the prediction calculation using the prediction tap and the tap coefficients.
[0024]
The first data processing apparatus can further include class tap extraction means for extracting the converted data used to classify the data of interest into one of several classes and outputting it as a class tap, and class classification means for performing class classification to determine the class of the data of interest based on the class tap. In this case, the calculation means can perform the prediction calculation using the prediction tap and the tap coefficients corresponding to the class of the data of interest.
[0025]
In the first data processing apparatus, the computing means can obtain processing data obtained by performing processing for improving the quality of the original data by performing a predetermined prediction computation.
[0026]
In the first data processing apparatus, the tap coefficients can be obtained by learning such that the prediction error of the predicted value of the processed data, obtained by performing a predetermined prediction calculation using the tap coefficients and the converted data, is statistically minimized.
[0027]
In the first data processing apparatus, the original data can be moving image or still image data.
[0028]
In the first data processing apparatus, the calculation means can obtain processing data obtained by performing processing for improving the image quality of the image data by performing a predetermined prediction calculation.
[0029]
In the first data processing apparatus, the calculation means can obtain processing data in which the resolution of the image data in the time or spatial direction is improved.
[0030]
The first data processing method of the present invention comprises: an acquisition step of acquiring tap coefficients obtained by performing learning; and a calculation step of performing a predetermined prediction calculation using the tap coefficients and the converted data, thereby obtaining processed data in which the converted data is decoded into the original data and the original data is subjected to predetermined processing.
[0031]
The first recording medium of the present invention has recorded thereon a program comprising: an acquisition step of acquiring tap coefficients obtained by performing learning; and a calculation step of performing a predetermined prediction calculation using the tap coefficients and the converted data, thereby obtaining processed data in which the converted data is decoded into the original data and the original data is subjected to predetermined processing.
[0032]
The second data processing apparatus according to the present invention comprises: semi-teacher data generation means for obtaining semi-teacher data by applying processing based on predetermined processing to teacher data serving as a teacher; student data generation means for generating student data serving as a student by at least orthogonally transforming or frequency transforming the semi-teacher data; and learning means for performing learning so that the prediction error of the predicted value of the teacher data, obtained by performing a prediction calculation using tap coefficients and the student data, is statistically minimized, and thereby obtaining the tap coefficients.
[0033]
In the second data processing apparatus, the learning means is configured so that the prediction error of the predicted value of the teacher data obtained by performing the linear primary prediction calculation using the tap coefficient and the student data is statistically minimized. Learning can be done.
[0034]
In the second data processing device, the student data generation means can generate student data by performing orthogonal transformation or frequency transformation on the semi-teacher data and further quantizing the data.
[0035]
In the second data processing apparatus, the student data generation means can generate the student data by orthogonally transforming or frequency transforming the semi-teacher data, quantizing the result, and then inversely quantizing it.
[0036]
In the second data processing apparatus, the student data generation means can generate student data by performing at least discrete cosine transform on the semi-teacher data.
[0037]
The second data processing device is further provided with prediction tap extraction means for extracting student data used together with the tap coefficient for predicting the attention teacher data of interest among the teacher data, and outputting it as a prediction tap. In this case, the learning means may perform learning so that the prediction error of the predicted value of the teacher data obtained by performing the prediction calculation using the prediction tap and the tap coefficient is statistically minimized. Can do.
[0038]
The second data processing apparatus can further include class tap extraction means for extracting the student data used to classify the teacher data of interest into one of several classes and outputting it as a class tap, and class classification means for performing class classification to determine the class of the teacher data of interest based on the class tap. In this case, the learning means can perform learning so that the prediction error of the predicted value of the teacher data, obtained by performing a prediction calculation using the prediction tap and the tap coefficients corresponding to the class of the teacher data of interest, is statistically minimized, and thereby obtain tap coefficients for each class.
[0039]
In the second data processing apparatus, the student data generation means can generate student data by performing at least orthogonal transform processing or frequency conversion of the semi-teacher data for each predetermined unit.
[0040]
In the second data processing apparatus, the semi-teacher data generation means can generate semi-teacher data by performing a process for degrading the quality of the teacher data.
[0041]
In the second data processing apparatus, the teacher data can be moving image or still image data.
[0042]
In the second data processing apparatus, the semi-teacher data generation unit can generate semi-teacher data by performing a process of degrading the image quality of the image data.
[0043]
In the second data processing apparatus, the semi-teacher data generation means can generate semi-teacher data in which the resolution of the image data in time or space direction is degraded.
[0044]
The second data processing method of the present invention comprises: a semi-teacher data generation step of obtaining semi-teacher data by applying processing based on predetermined processing to teacher data serving as a teacher; a student data generation step of generating student data serving as a student by at least orthogonally transforming or frequency transforming the semi-teacher data; and a learning step of performing learning so that the prediction error of the predicted value of the teacher data, obtained by performing a prediction calculation using tap coefficients and the student data, is statistically minimized, and thereby obtaining the tap coefficients.
[0045]
The second recording medium of the present invention has recorded thereon a program comprising: a semi-teacher data generation step of obtaining semi-teacher data by applying processing based on predetermined processing to teacher data serving as a teacher; a student data generation step of generating student data serving as a student by at least orthogonally transforming or frequency transforming the semi-teacher data; and a learning step of performing learning so that the prediction error of the predicted value of the teacher data, obtained by performing a prediction calculation using tap coefficients and the student data, is statistically minimized, and thereby obtaining the tap coefficients.
[0046]
In the first data processing apparatus, data processing method, and recording medium of the present invention, tap coefficients obtained by learning are acquired, and a predetermined prediction calculation is performed using the tap coefficients and the converted data, whereby processed data is obtained in which the converted data is decoded into the original data and the original data is subjected to predetermined processing.
[0047]
In the second data processing apparatus, data processing method, and recording medium of the present invention, processing based on predetermined processing is applied to teacher data serving as a teacher, and the resulting semi-teacher data is at least orthogonally transformed or frequency transformed to generate student data serving as a student. Learning is then performed so that the prediction error of the predicted value of the teacher data, obtained by performing a prediction calculation using tap coefficients and the student data, is statistically minimized, and the tap coefficients are thereby obtained.
[0048]
DETAILED DESCRIPTION OF THE INVENTION
FIG. 2 shows a configuration example of an embodiment of an image transmission system to which the present invention is applied.
[0049]
The image data to be transmitted is supplied to the encoder 21. The encoder 21 applies, for example, JPEG encoding to the image data supplied to it to obtain encoded data. That is, the encoder 21 is configured in the same manner as, for example, the JPEG encoding apparatus shown in FIG. 1A, and JPEG-encodes the image data. The encoded data obtained by JPEG encoding in the encoder 21 is recorded on a recording medium 23 such as a semiconductor memory, a magneto-optical disk, a magnetic disk, an optical disk, a magnetic tape, or a phase change disk, or is transmitted via a transmission medium 24 such as terrestrial waves, a satellite link, a CATV (Cable Television) network, the Internet, or a public line.
[0050]
The decoder 22 receives the encoded data provided via the recording medium 23 or the transmission medium 24 and decodes it into high-quality image data. The decoded high-quality image data is supplied to, for example, a monitor (not shown) and displayed.
[0051]
Next, FIG. 3 shows a configuration example of the decoder 22 of FIG. 2.
[0052]
The encoded data is supplied to the entropy decoding circuit 31. The entropy decoding circuit 31 entropy-decodes the encoded data and supplies the resulting quantized DCT coefficients Q of each block to the coefficient conversion circuit 32. As in the case described for the entropy decoding circuit 11 of FIG. 1C, the encoded data contains a quantization table in addition to the entropy-encoded quantized DCT coefficients, and the quantization table can be used for decoding the quantized DCT coefficients as necessary, as described later.
[0053]
The coefficient conversion circuit 32 performs a predetermined prediction calculation using the quantized DCT coefficients Q from the entropy decoding circuit 31 and tap coefficients obtained by the learning described later, thereby decoding the quantized DCT coefficients of each block into the original block of 8 × 8 pixels and, in addition, obtaining data in which that original block has been processed to improve its image quality. That is, the original block consists of 8 × 8 pixels, but the coefficient conversion circuit 32 performs the prediction calculation using the tap coefficients to obtain a block of 16 × 16 pixels in which the horizontal and vertical spatial resolutions of the 8 × 8 pixel block are both doubled. Accordingly, as shown in FIG. 4, the coefficient conversion circuit 32 here decodes a block of 8 × 8 quantized DCT coefficients into a block of 16 × 16 pixels and outputs it.
[0054]
The block decomposition circuit 33 obtains and outputs a decoded image with improved spatial resolution by unblocking the 16 × 16 pixel block obtained in the coefficient conversion circuit 32.
[0055]
Next, the processing of the decoder 22 of FIG. 3 will be described with reference to the flowchart of FIG. 5.
[0056]
The encoded data is sequentially supplied to the entropy decoding circuit 31. In step S1, the entropy decoding circuit 31 entropy-decodes the encoded data and supplies the quantized DCT coefficients Q of each block to the coefficient conversion circuit 32. In step S2, the coefficient conversion circuit 32 performs a prediction calculation using the tap coefficients to decode the quantized DCT coefficients Q of each block from the entropy decoding circuit 31 into the pixel values of each block, obtains a so-called high-resolution block in which the spatial resolution of that block is improved, and supplies it to the block decomposition circuit 33. In step S3, the block decomposition circuit 33 performs block decomposition, which unblocks the blocks of pixel values with improved spatial resolution from the coefficient conversion circuit 32, outputs the resulting high-resolution decoded image, and the processing ends.
[0057]
Next, the coefficient conversion circuit 32 of FIG. 3 can, for example, use class classification adaptive processing to decode the quantized DCT coefficients into pixel values and obtain an image with improved spatial resolution.
[0058]
Class classification adaptive processing consists of class classification processing and adaptive processing: the class classification processing sorts data into classes according to its properties, and the adaptive processing is applied to each class. The adaptive processing uses the following technique. Here, to simplify the description, the adaptive processing is explained for the case in which the quantized DCT coefficients are decoded into the original image.
[0059]
In this case, the adaptive processing decodes the quantized DCT coefficients into the original pixel values by, for example, obtaining the predicted values of the original pixels as a linear combination of the quantized DCT coefficients and predetermined tap coefficients.
[0060]
Specifically, for example, consider taking a certain image as teacher data, taking as student data the quantized DCT coefficients obtained by applying DCT processing to that image block by block and then quantizing, and obtaining the predicted value $E[y]$ of the pixel value $y$ of a pixel of the teacher data by a linear first-order combination model defined by the linear combination of a set of several quantized DCT coefficients $x_1, x_2, \ldots$ and predetermined tap coefficients $w_1, w_2, \ldots$. In this case, the predicted value $E[y]$ can be expressed by the following equation.
[0061]
$$E[y] = w_1 x_1 + w_2 x_2 + \cdots \qquad (1)$$
[0062]
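To generalize equation (1), define a matrix $W$ consisting of the set of tap coefficients $w_j$, a matrix $X$ consisting of the set of student data $x_{ij}$, and a matrix $Y'$ consisting of the set of predicted values $E[y_i]$ as
[Expression 1]
$$
X = \begin{pmatrix} x_{11} & x_{12} & \cdots & x_{1J} \\ x_{21} & x_{22} & \cdots & x_{2J} \\ \vdots & \vdots & \ddots & \vdots \\ x_{I1} & x_{I2} & \cdots & x_{IJ} \end{pmatrix}, \quad
W = \begin{pmatrix} w_1 \\ w_2 \\ \vdots \\ w_J \end{pmatrix}, \quad
Y' = \begin{pmatrix} E[y_1] \\ E[y_2] \\ \vdots \\ E[y_I] \end{pmatrix}
$$
Then, the following observation equation holds.
[0063]
$$XW = Y' \qquad (2)$$
Here, the component $x_{ij}$ of the matrix $X$ denotes the $j$-th student data in the $i$-th set of student data (the set of student data used to predict the $i$-th teacher data $y_i$), and the component $w_j$ of the matrix $W$ denotes the tap coefficient that is multiplied by the $j$-th student data in a set of student data. Also, $y_i$ denotes the $i$-th teacher data, so $E[y_i]$ denotes the predicted value of the $i$-th teacher data. Note that $y$ on the left side of equation (1) is the component $y_i$ of the matrix $Y$ with the suffix $i$ omitted, and $x_1, x_2, \ldots$ on the right side of equation (1) are likewise the components $x_{ij}$ of the matrix $X$ with the suffix $i$ omitted.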
[0064]
Now consider applying the least squares method to this observation equation to obtain a predicted value $E[y]$ close to the original pixel value $y$. In this case, define a matrix $Y$ consisting of the set of true pixel values $y$ serving as teacher data, and a matrix $E$ consisting of the set of residuals $e$ of the predicted values $E[y]$ with respect to the pixel values $y$, as
[0065]
[Expression 2]
$$
E = \begin{pmatrix} e_1 \\ e_2 \\ \vdots \\ e_I \end{pmatrix}, \quad
Y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_I \end{pmatrix}
$$
Then, from equation (2), the following residual equation holds.
[0066]
$$XW = Y + E \qquad (3)$$
[0067]
In this case, the tap coefficients $w_j$ for obtaining a predicted value $E[y]$ close to the original pixel value $y$ can be found by minimizing the following square error.
[Equation 3]
$$\sum_{i=1}^{I} e_i^2$$
[0068]
Therefore, the tap coefficients $w_j$ for which the derivative of the above square error with respect to $w_j$ is 0, that is, the tap coefficients $w_j$ satisfying the following equation, are the optimum values for obtaining a predicted value $E[y]$ close to the original pixel value $y$.
[0069]
[Expression 4]
$$e_1 \frac{\partial e_1}{\partial w_j} + e_2 \frac{\partial e_2}{\partial w_j} + \cdots + e_I \frac{\partial e_I}{\partial w_j} = 0 \quad (j = 1, 2, \ldots, J) \qquad (4)$$
[0070]
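Therefore, first, differentiating equation (3) with respect to the tap coefficients $w_j$ yields the following equation.
[0071]
[Equation 5]
$$\frac{\partial e_i}{\partial w_1} = x_{i1}, \quad \frac{\partial e_i}{\partial w_2} = x_{i2}, \quad \ldots, \quad \frac{\partial e_i}{\partial w_J} = x_{iJ} \quad (i = 1, 2, \ldots, I) \qquad (5)$$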
[0072]
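From equations (4) and (5), equation (6) is obtained.
[0073]
[Formula 6]
$$\sum_{i=1}^{I} e_i x_{i1} = 0, \quad \sum_{i=1}^{I} e_i x_{i2} = 0, \quad \ldots, \quad \sum_{i=1}^{I} e_i x_{iJ} = 0 \qquad (6)$$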
[0074]
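Furthermore, taking into account the relationship among the student data $x_{ij}$, the tap coefficients $w_j$, the teacher data $y_i$, and the residuals $e_i$ in the residual equation (3), the following normal equations are obtained from equation (6).
[0075]
[Expression 7]
$$
\begin{cases}
\left(\sum_{i=1}^{I} x_{i1} x_{i1}\right) w_1 + \left(\sum_{i=1}^{I} x_{i1} x_{i2}\right) w_2 + \cdots + \left(\sum_{i=1}^{I} x_{i1} x_{iJ}\right) w_J = \sum_{i=1}^{I} x_{i1} y_i \\
\left(\sum_{i=1}^{I} x_{i2} x_{i1}\right) w_1 + \left(\sum_{i=1}^{I} x_{i2} x_{i2}\right) w_2 + \cdots + \left(\sum_{i=1}^{I} x_{i2} x_{iJ}\right) w_J = \sum_{i=1}^{I} x_{i2} y_i \\
\qquad \vdots \\
\left(\sum_{i=1}^{I} x_{iJ} x_{i1}\right) w_1 + \left(\sum_{i=1}^{I} x_{iJ} x_{i2}\right) w_2 + \cdots + \left(\sum_{i=1}^{I} x_{iJ} x_{iJ}\right) w_J = \sum_{i=1}^{I} x_{iJ} y_i
\end{cases}
\qquad (7)
$$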
[0076]
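Furthermore, by defining the matrix (covariance matrix) $A$ and the vector $v$ as
[Equation 8]
$$
A = \begin{pmatrix}
\sum_{i=1}^{I} x_{i1} x_{i1} & \sum_{i=1}^{I} x_{i1} x_{i2} & \cdots & \sum_{i=1}^{I} x_{i1} x_{iJ} \\
\sum_{i=1}^{I} x_{i2} x_{i1} & \sum_{i=1}^{I} x_{i2} x_{i2} & \cdots & \sum_{i=1}^{I} x_{i2} x_{iJ} \\
\vdots & \vdots & \ddots & \vdots \\
\sum_{i=1}^{I} x_{iJ} x_{i1} & \sum_{i=1}^{I} x_{iJ} x_{i2} & \cdots & \sum_{i=1}^{I} x_{iJ} x_{iJ}
\end{pmatrix}, \quad
v = \begin{pmatrix}
\sum_{i=1}^{I} x_{i1} y_i \\
\sum_{i=1}^{I} x_{i2} y_i \\
\vdots \\
\sum_{i=1}^{I} x_{iJ} y_i
\end{pmatrix}
$$
and defining the vector $W$ as shown in Expression 1, the normal equations shown in equation (7) can be expressed as
$$AW = v \qquad (8)$$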
[0077]
By preparing a sufficient number of sets of student data $x_{ij}$ and teacher data $y_i$, as many normal equations of equation (7) can be set up as the number $J$ of tap coefficients $w_j$ to be obtained. Therefore, by solving equation (8) for the vector $W$ (to solve equation (8), the matrix $A$ in equation (8) must be regular), the optimum tap coefficients $w_j$ (here, the tap coefficients that minimize the square error) can be obtained. Equation (8) can be solved using, for example, the sweep-out method (Gauss-Jordan elimination).
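The accumulation of the normal equations and their solution by Gauss-Jordan elimination can be sketched as follows. This is a minimal pure-Python sketch under stated assumptions: each training sample is a pair of one set of student data and one teacher value, partial pivoting is added for numerical safety (the text does not mention it), and the names `learn_tap_coefficients` and `solve_gauss_jordan` are hypothetical.

```python
def solve_gauss_jordan(A, v):
    """Solve A W = v by Gauss-Jordan elimination with partial pivoting."""
    n = len(v)
    M = [list(row) + [b] for row, b in zip(A, v)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("matrix A is not regular")
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [m / p for m in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]


def learn_tap_coefficients(samples):
    """samples: list of (x, y), x = one set of student data, y = teacher value."""
    n = len(samples[0][0])
    A = [[0.0] * n for _ in range(n)]
    v = [0.0] * n
    for x, y in samples:
        for j in range(n):
            v[j] += x[j] * y                # sum over i of x_ij * y_i
            for k in range(n):
                A[j][k] += x[j] * x[k]      # sum over i of x_ij * x_ik
    return solve_gauss_jordan(A, v)
```

When the teacher values are generated exactly by a linear combination of the student data, the learned coefficients reproduce the generating weights up to rounding.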
[0078]
Adaptive processing consists of obtaining the optimum tap coefficients $w_j$ as described above and then using those tap coefficients $w_j$ to obtain, by equation (1), a predicted value $E[y]$ close to the original pixel value $y$.
[0079]
For example, when an image of the same quality as the image to be JPEG-encoded is used as teacher data, and the quantized DCT coefficients obtained by applying DCT and quantization to that teacher data are used as student data, the tap coefficients obtained are those that give a statistically minimum prediction error when JPEG-encoded image data is decoded into the original image data.
[0080]
Therefore, even if the compression ratio of JPEG encoding is raised, that is, even if the quantization steps used for quantization are made coarse, the adaptive processing performs decoding in which the prediction error is statistically minimized. In effect, the decoding of the JPEG-encoded image and processing for improving its image quality (hereinafter referred to as improvement processing, as appropriate) are performed simultaneously. As a result, the image quality of the decoded image can be maintained even when the compression ratio is raised.
[0081]
Further, for example, when an image of higher quality than the image to be JPEG-encoded is used as teacher data, and the quantized DCT coefficients obtained by degrading the teacher data to the same quality as the image to be JPEG-encoded and then applying DCT and quantization are used as student data, the tap coefficients obtained are those that give a statistically minimum prediction error when JPEG-encoded image data is decoded into high-quality image data.
[0082]
Therefore, also in this case, according to the adaptive processing, the decoding processing of the JPEG-encoded image and the improvement processing for further improving the image quality are performed at the same time. Note that, as described above, it is possible to obtain a tap coefficient that sets the image quality of the decoded image to an arbitrary level by changing the image quality of the image that becomes teacher data or student data.
[0083]
FIG. 6 shows a first configuration example of the coefficient conversion circuit 32 of FIG. 3 that decodes the quantized DCT coefficients into pixel values by the class classification adaptive processing as described above.
[0084]
The quantized DCT coefficients for each block output from the entropy decoding circuit 31 (FIG. 3) are supplied to the prediction tap extraction circuit 41 and the class tap extraction circuit 42.
[0085]
The prediction tap extraction circuit 41 sequentially sets, as the target high-quality block, the block of high-quality pixel values (hereinafter referred to as a high-quality block as appropriate) corresponding to each block of 8 × 8 quantized DCT coefficients supplied to it (hereinafter referred to as a DCT block as appropriate). The high-quality block does not exist at this stage but is virtually assumed; in this embodiment it is, as described above, a block of 16 × 16 pixels. Each pixel constituting the target high-quality block is then sequentially set as the target pixel, for example in so-called raster scan order. Further, the prediction tap extraction circuit 41 extracts the quantized DCT coefficients used for predicting the pixel value of the target pixel and sets them as a prediction tap.
[0086]
That is, as shown in FIG. 7, for example, the prediction tap extraction circuit 41 extracts as the prediction tap all the quantized DCT coefficients of the DCT block corresponding to the high-quality block to which the target pixel belongs, that is, the 8 × 8 = 64 quantized DCT coefficients. Therefore, in this embodiment, the same prediction tap is configured for all pixels of a given high-quality block. However, the prediction tap can also be configured from different quantized DCT coefficients for each target pixel.
[0087]
The prediction taps for the pixels constituting the high-quality block obtained by the prediction tap extraction circuit 41, that is, 256 sets of prediction taps for the 16 × 16 = 256 pixels, are supplied to the product-sum operation circuit 45. However, in the present embodiment, as described above, the same prediction tap is configured for all the pixels of the high-quality block, so in practice only one set of prediction taps per high-quality block need be supplied to the product-sum operation circuit 45.
[0088]
The class tap extraction circuit 42 extracts a quantized DCT coefficient used for class classification for classifying the pixel of interest into one of several classes, and sets it as a class tap.
[0089]
In JPEG encoding, an image is encoded (DCT processing and quantization) for each block of 8 × 8 pixels (hereinafter referred to as a pixel block as appropriate). For this reason, the pixels belonging to the high-quality block obtained by improving the image quality of a given pixel block are, for example, all classified into the same class. Therefore, the class tap extraction circuit 42 configures the same class tap for each pixel of a given high-quality block. That is, like the prediction tap extraction circuit 41, the class tap extraction circuit 42 extracts as the class tap, for example, all the 8 × 8 quantized DCT coefficients of the DCT block corresponding to the high-quality block to which the target pixel belongs, as shown in FIG.
[0090]
Here, classifying all pixels belonging to a high-quality block into the same class is equivalent to classifying the high-quality block itself. Therefore, the class tap extraction circuit 42 need only configure one set of class taps for classifying the target high-quality block, rather than 256 sets of class taps for classifying each of the 16 × 16 = 256 pixels constituting it. For this reason, for each high-quality block, the class tap extraction circuit 42 extracts the 64 quantized DCT coefficients of the DCT block corresponding to that high-quality block and uses them as the class tap for classifying the high-quality block.
[0091]
Note that the quantized DCT coefficients constituting the prediction tap and the class tap are not limited to those having the above-described pattern.
[0092]
The class tap of the high-quality block of interest obtained in the class tap extraction circuit 42 is supplied to the class classification circuit 43. Based on the class tap from the class tap extraction circuit 42, the class classification circuit 43 classifies the high-quality block of interest and outputs the class code corresponding to the resulting class.
[0093]
Here, as a method of classifying, for example, ADRC (Adaptive Dynamic Range Coding) or the like can be employed.
[0094]
In the method using ADRC, the quantized DCT coefficients constituting the class tap are subjected to ADRC processing, and the class of the high-quality block of interest is determined according to the ADRC code obtained as a result.
[0095]
In K-bit ADRC, for example, the maximum value MAX and the minimum value MIN of the quantized DCT coefficients constituting the class tap are detected, DR = MAX - MIN is taken as the local dynamic range of the set, and the quantized DCT coefficients constituting the class tap are requantized to K bits based on this dynamic range DR. That is, the minimum value MIN is subtracted from each quantized DCT coefficient constituting the class tap, and the result is divided (quantized) by DR/2^K. The bit string obtained by arranging the resulting K-bit values of the quantized DCT coefficients constituting the class tap in a predetermined order is output as the ADRC code. Therefore, when a class tap is subjected to, for example, 1-bit ADRC processing, the minimum value MIN is subtracted from each quantized DCT coefficient constituting the class tap, the result is divided by the average of the maximum value MAX and the minimum value MIN, and each quantized DCT coefficient thereby becomes 1 bit (is binarized). A bit string in which these 1-bit quantized DCT coefficients are arranged in a predetermined order is then output as the ADRC code.
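The K-bit requantization can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name adrc_code is hypothetical, the maximum value is clamped to 2^K - 1 so that it fits in K bits, a flat tap (DR = 0) is mapped to all zeros, and the "predetermined order" is taken to be the order of the input list. None of these choices is specified in the text.

```python
def adrc_code(taps, k=1):
    """Requantize each tap to k bits relative to the local dynamic range
    and pack the results into one integer, first tap in the top bits."""
    lo, hi = min(taps), max(taps)
    dr = hi - lo                 # DR = MAX - MIN
    levels = 1 << k              # 2^K quantization levels
    code = 0
    for t in taps:
        if dr == 0:
            q = 0                # assumption: flat tap maps to level 0
        else:
            # (t - MIN) / (DR / 2^K), clamped so that MAX fits in k bits
            q = min(int((t - lo) * levels / dr), levels - 1)
        code = (code << k) | q
    return code
```

For example, adrc_code([0, 10, 4, 5], k=1) requantizes the four values to the bits 0, 1, 0, 1 and returns 5.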
[0096]
Note that the class classification circuit 43 can also, for example, output the level distribution pattern of the quantized DCT coefficients constituting the class tap directly as the class code. In this case, however, if the class tap is composed of N quantized DCT coefficients and K bits are assigned to each quantized DCT coefficient, the number of possible class codes output by the class classification circuit 43 is (2^N)^K, a huge number that grows exponentially with the number of bits K of the quantized DCT coefficients.
[0097]
Therefore, the class classification circuit 43 preferably performs class classification after compressing the information amount of the class tap by the above-described ADRC processing or vector quantization.
[0098]
In this embodiment, the class tap is composed of 64 quantized DCT coefficients as described above. Therefore, even if class classification is performed by applying, for example, 1-bit ADRC processing to the class tap, the number of class codes is as large as 2⁶⁴.
[0099]
Therefore, in the present embodiment, the class classification circuit 43 extracts highly important feature quantities from the quantized DCT coefficients constituting the class tap and performs class classification based on those feature quantities, thereby reducing the number of classes.
[0100]
That is, FIG. 8 shows a configuration example of the class classification circuit 43 of FIG.
[0101]
The class tap is supplied to the power calculation circuit 51, which divides the quantized DCT coefficients constituting the class tap into several spatial frequency bands and calculates the power of each band.
[0102]
That is, the power calculation circuit 51 divides the 8 × 8 quantized DCT coefficients constituting the class tap into, for example, the four spatial frequency bands S₀, S₁, S₂, S₃ shown in FIG.
[0103]
Here, suppose that each of the 8 × 8 quantized DCT coefficients constituting the class tap is denoted by the letter x with a sequential integer subscript starting from 0, assigned in raster scan order as shown in FIG. Then the spatial frequency band S₀ consists of the four quantized DCT coefficients x₀, x₁, x₈, x₉, and the spatial frequency band S₁ consists of the 12 quantized DCT coefficients x₂, x₃, x₄, x₅, x₆, x₇, x₁₀, x₁₁, x₁₂, x₁₃, x₁₄, x₁₅. In addition, the spatial frequency band S₂ consists of the 12 quantized DCT coefficients x₁₆, x₁₇, x₂₄, x₂₅, x₃₂, x₃₃, x₄₀, x₄₁, x₄₈, x₄₉, x₅₆, x₅₇, and the spatial frequency band S₃ consists of the 36 quantized DCT coefficients x₁₈, x₁₉, x₂₀, x₂₁, x₂₂, x₂₃, x₂₆, x₂₇, x₂₈, x₂₉, x₃₀, x₃₁, x₃₄, x₃₅, x₃₆, x₃₇, x₃₈, x₃₉, x₄₂, x₄₃, x₄₄, x₄₅, x₄₆, x₄₇, x₅₀, x₅₁, x₅₂, x₅₃, x₅₄, x₅₅, x₅₈, x₅₉, x₆₀, x₆₁, x₆₂, x₆₃.
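The index lists above follow a regular pattern on the 8 × 8 grid, which the following Python sketch reproduces. The row and column rule is inferred from the listed indices rather than stated in the text, and the names band_of and BANDS are hypothetical.

```python
def band_of(index):
    """Return the band number (0 to 3) of raster-scan index 0..63."""
    row, col = divmod(index, 8)
    if row < 2:
        return 0 if col < 2 else 1   # top two rows: S0 (left) or S1
    return 2 if col < 2 else 3       # remaining rows: S2 (left) or S3


# BANDS[b] lists the raster-scan indices belonging to band S_b.
BANDS = [[i for i in range(64) if band_of(i) == b] for b in range(4)]
```

Under this rule, BANDS[0] is [0, 1, 8, 9] and the four bands contain 4, 12, 12, and 36 coefficients respectively, matching the counts given above.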
[0104]
Further, for each of the spatial frequency bands S₀, S₁, S₂, S₃, the power calculation circuit 51 calculates the powers P₀, P₁, P₂, P₃ of the AC components of the quantized DCT coefficients and outputs them to the class code generation circuit 52.
[0105]
That is, for the spatial frequency band S₀, the power calculation circuit 51 obtains the sum of squares x₁² + x₈² + x₉² of the AC components x₁, x₈, x₉ among the four quantized DCT coefficients x₀, x₁, x₈, x₉ described above, and outputs it to the class code generation circuit 52 as the power P₀. For the spatial frequency band S₁, the power calculation circuit 51 obtains the sum of squares of the AC components of the 12 quantized DCT coefficients described above, that is, of all 12 quantized DCT coefficients, and outputs it to the class code generation circuit 52 as the power P₁. For the spatial frequency bands S₂ and S₃ as well, the power calculation circuit 51 obtains the powers P₂ and P₃ in the same way as for the spatial frequency band S₁ and outputs them to the class code generation circuit 52.
[0106]
The class code generation circuit 52 compares the powers P₀, P₁, P₂, P₃ from the power calculation circuit 51 with the corresponding thresholds TH0, TH1, TH2, TH3 stored in the threshold table storage unit 53, and outputs a class code based on the respective magnitude relationships. That is, the class code generation circuit 52 compares the power P₀ with the threshold TH0 and obtains a 1-bit code representing their magnitude relationship. Similarly, it compares the power P₁ with the threshold TH1, the power P₂ with the threshold TH2, and the power P₃ with the threshold TH3, obtaining a 1-bit code for each. The class code generation circuit 52 then outputs, for example, the 4-bit code obtained by arranging these four 1-bit codes in a predetermined order (hence a value from 0 to 15) as the class code representing the class of the high-quality block of interest. Therefore, in this embodiment, the high-quality block of interest is classified into one of 2⁴ (= 16) classes.
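The power calculation and threshold comparison can be sketched together in Python. This is a minimal illustration, not the circuit itself: the names band_of, band_powers, and class_code are hypothetical, the band rule is inferred from the coefficient lists, a bit is assumed to be 1 when the power exceeds its threshold, and the bit for P₀ is assumed to come first. The text only says that each bit represents the magnitude relationship and that the bits are arranged in a predetermined order.

```python
def band_of(index):
    """Band number (0 to 3) of raster-scan index 0..63 (inferred rule)."""
    row, col = divmod(index, 8)
    if row < 2:
        return 0 if col < 2 else 1
    return 2 if col < 2 else 3


def band_powers(coeffs):
    """Sum of squares of the AC coefficients in each of the four bands."""
    powers = [0, 0, 0, 0]
    for i, c in enumerate(coeffs):
        if i == 0:
            continue             # x0 is the DC component and is skipped
        powers[band_of(i)] += c * c
    return powers


def class_code(coeffs, thresholds):
    """Pack the four power/threshold comparisons into a 4-bit code."""
    code = 0
    for p, th in zip(band_powers(coeffs), thresholds):
        code = (code << 1) | (1 if p > th else 0)   # assumed bit sense
    return code
```

Because only four comparisons are made, the result is always in the range 0 to 15, regardless of the 64 input values.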
[0107]
The threshold table storage unit 53 stores the thresholds TH0 to TH3 to be compared with the powers P₀ to P₃ of the spatial frequency bands S₀ to S₃, respectively.
[0108]
In the above case, the DC component x₀ of the quantized DCT coefficients is not used for class classification, but class classification can also be performed using this DC component x₀.
[0109]
Returning to FIG. 6, the class code output from the class classification circuit 43 as described above is given to the coefficient table storage unit 44 as an address.
[0110]
The coefficient table storage unit 44 stores a coefficient table in which tap coefficients obtained by the learning processing described later are registered, and outputs the tap coefficients stored at the address corresponding to the class code output by the class classification circuit 43 to the product-sum operation circuit 45.
[0111]
Here, in the present embodiment, one class code is obtained for the high-quality block of interest. On the other hand, in the present embodiment, the high-quality block is composed of 16 × 16 = 256 pixels. Therefore, for the high-quality block of interest, 256 sets of tap coefficients are required, one for decoding each of the 256 pixels constituting it. Accordingly, the coefficient table storage unit 44 stores 256 sets of tap coefficients at the address corresponding to one class code.
[0112]
The product-sum operation circuit 45 acquires the prediction tap output from the prediction tap extraction circuit 41 and the tap coefficients output from the coefficient table storage unit 44, performs the linear prediction calculation (product-sum operation) shown in equation (1) using the prediction tap and the tap coefficients, and outputs the resulting pixel values (predicted values) of the 16 × 16 pixels of the target high-quality block to the block decomposition circuit 33 (FIG. 3).
[0113]
Here, in the prediction tap extraction circuit 41, as described above, each pixel of the target high-quality block is sequentially set as the target pixel. The product-sum operation circuit 45 performs processing in an operation mode (hereinafter referred to as a pixel position mode as appropriate) corresponding to the position of the target pixel within the target high-quality block.
[0114]
That is, for example, if the i-th pixel in raster scan order among the pixels of the target high-quality block is denoted p_i, then when the pixel p_i is the target pixel, the product-sum operation circuit 45 performs the processing of pixel position mode #i.
[0115]
Specifically, as described above, the coefficient table storage unit 44 outputs 256 sets of tap coefficients for decoding each of the 256 pixels constituting the target high-quality block. If the set of tap coefficients for decoding the pixel p_i is denoted W_i, then when the operation mode is pixel position mode #i, the product-sum operation circuit 45 performs the product-sum operation of equation (1) using the prediction tap and the set W_i among the 256 sets of tap coefficients, and takes the result as the decoded value of the pixel p_i.
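The pixel position modes amount to applying a different coefficient set W_i to the same prediction tap for each of the 256 pixels. The Python sketch below assumes that equation (1) is a plain weighted sum of the prediction tap values with no offset term; the function name decode_block and the constants are hypothetical.

```python
BLOCK = 16    # side length of the high-quality block
N_TAPS = 64   # quantized DCT coefficients in one prediction tap


def decode_block(prediction_tap, tap_sets):
    """Apply the coefficient set W_i to the shared prediction tap for each
    pixel position mode #i and return the block as 16 rows of 16 values."""
    if len(prediction_tap) != N_TAPS or len(tap_sets) != BLOCK * BLOCK:
        raise ValueError("expected 64 tap values and 256 coefficient sets")
    pixels = [
        sum(w * x for w, x in zip(W_i, prediction_tap))  # product-sum
        for W_i in tap_sets
    ]
    # Pixel p_i is the i-th pixel in raster scan order.
    return [pixels[r * BLOCK:(r + 1) * BLOCK] for r in range(BLOCK)]
```

Since the prediction tap is shared by all 256 pixels, only the coefficient set changes from one pixel position mode to the next.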
[0116]
Next, processing of the coefficient conversion circuit 32 of FIG. 6 will be described with reference to the flowchart of FIG.
[0117]
The quantized DCT coefficients for each block output from the entropy decoding circuit 31 (FIG. 3) are sequentially received by the prediction tap extraction circuit 41 and the class tap extraction circuit 42, and the prediction tap extraction circuit 41 sequentially sets the high-quality block corresponding to each block of quantized DCT coefficients (DCT block) supplied to it as the target high-quality block.
[0118]
Then, in step S11, the class tap extraction circuit 42 extracts, from the quantized DCT coefficients it has received, those used for classifying the target high-quality block, configures them as a class tap, and supplies the class tap to the class classification circuit 43.
[0119]
In step S12, the class classification circuit 43 classifies the target high-quality block using the class tap from the class tap extraction circuit 42, and outputs the resulting class code to the coefficient table storage unit 44.
[0120]
That is, in step S12, as shown in the flowchart of FIG. 11, first, in step S21, the power calculation circuit 51 of the class classification circuit 43 (FIG. 8) divides the 8 × 8 quantized DCT coefficients constituting the class tap into the four spatial frequency bands S₀ to S₃ shown in FIG. and calculates their respective powers P₀ to P₃. These powers P₀ to P₃ are output from the power calculation circuit 51 to the class code generation circuit 52.
[0121]
In step S22, the class code generation circuit 52 reads the thresholds TH0 to TH3 from the threshold table storage unit 53, compares each of the powers P₀ to P₃ from the power calculation circuit 51 with the corresponding threshold TH0 to TH3, generates a class code based on the respective magnitude relationships, and the process returns.
[0122]
Returning to FIG. 10, the class code obtained as described above in step S12 is given as an address from the class classification circuit 43 to the coefficient table storage unit 44.
[0123]
When the coefficient table storage unit 44 receives the class code as an address from the class classification circuit 43, in step S13 it reads out the 256 sets of tap coefficients stored at that address (the 256 sets of tap coefficients corresponding to the class of the class code) and outputs them to the product-sum operation circuit 45.
[0124]
Then, the process proceeds to step S14, and the prediction tap extraction circuit 41 sets, as the target pixel, a pixel of the target high-quality block that has not yet been set as the target pixel in raster scan order, extracts the quantized DCT coefficients used for predicting the pixel value of that target pixel, and configures them as a prediction tap. This prediction tap is supplied from the prediction tap extraction circuit 41 to the product-sum operation circuit 45.
[0125]
Here, in the present embodiment, the same prediction tap is configured for all the pixels of each high-quality block. Therefore, in practice, the processing of step S14 need only be performed for the pixel that is first set as the pixel of interest in the high-quality block of interest, and need not be performed for the remaining 255 pixels.
[0126]
In step S15, the product-sum operation circuit 45 acquires, from among the 256 sets of tap coefficients output from the coefficient table storage unit 44 in step S13, the set of tap coefficients corresponding to the pixel position mode for the target pixel. Using those tap coefficients and the prediction tap supplied from the prediction tap extraction circuit 41 in step S14, it performs the product-sum operation shown in equation (1) to obtain the decoded value of the pixel value of the target pixel.
[0127]
Then, the process proceeds to step S16, and the prediction tap extraction circuit 41 determines whether all pixels of the target high-quality block have been processed as the target pixel. If it is determined in step S16 that not all pixels of the target high-quality block have yet been processed as the target pixel, the process returns to step S14, where the prediction tap extraction circuit 41 newly sets as the target pixel a pixel of the target high-quality block that has not yet been set as the target pixel in raster scan order, and the same processing is repeated.
[0128]
If it is determined in step S16 that all the pixels of the target high-quality block have been processed as the target pixel, that is, if the decoded values of all the pixels of the target high-quality block have been obtained (the result of decoding the 8 × 8 quantized DCT coefficients into 8 × 8 pixels and further improving those 8 × 8 pixels into 16 × 16 high-quality pixels), the product-sum operation circuit 45 outputs the high-quality block composed of those decoded values to the block decomposition circuit 33 (FIG. 3), and the process ends.
[0129]
Note that the process according to the flowchart of FIG. 10 is repeated each time the prediction tap extraction circuit 41 sets a new high-quality block of interest.
[0130]
Next, FIG. 12 shows a configuration example of an embodiment of a learning apparatus that performs learning processing of tap coefficients stored in the coefficient table storage unit 44 of FIG.
[0131]
One or more pieces of image data for learning are supplied to the thinning circuit 60 as teacher data, which serve as the teacher during learning. The thinning circuit 60 processes the image data serving as teacher data in accordance with the improvement processing that the product-sum operation circuit 45 in the coefficient conversion circuit 32 of FIG. performs by carrying out the product-sum operation using the tap coefficients. That is, here, the improvement processing converts 8 × 8 pixels into 16 × 16 high-quality pixels (with improved resolution) in which the horizontal and vertical spatial resolutions are both doubled. Therefore, the thinning circuit 60 thins out the pixels of the image data serving as teacher data and generates image data (hereinafter referred to as semi-teacher data as appropriate) in which the numbers of horizontal and vertical pixels are both halved.
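As a rough Python illustration of halving the pixel count in both directions, the sketch below keeps every second pixel of every second row. Which pixels are kept, and whether any filtering precedes the thinning, is not stated in the text, so this simple decimation and the function name thin_by_two are assumptions.

```python
def thin_by_two(image):
    """Keep every second row and every second column of a 2-D pixel list,
    halving both the vertical and the horizontal pixel counts."""
    return [row[::2] for row in image[::2]]
```

A 16 × 16 teacher block thinned this way yields the 8 × 8 pixel block that is then passed through DCT and quantization.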
[0132]
The image data serving as semi-teacher data has the same image quality (resolution) as the image data to be JPEG-encoded in the encoder 21 (FIG. 2). For example, if the image to be JPEG-encoded is an SD (Standard Density) image, an HD (High Density) image with twice the number of horizontal and vertical pixels of the SD image must be used as the teacher data image.
[0133]
The blocking circuit 61 blocks an SD image as one or more pieces of semi-teacher data generated by the thinning circuit 60 into a pixel block of 8 × 8 pixels as in the case of JPEG encoding.
[0134]
The DCT circuit 62 sequentially reads out the pixel blocks formed into blocks by the blocking circuit 61, and performs DCT processing on the pixel blocks to form a block of DCT coefficients. This block of DCT coefficients is supplied to the quantization circuit 63.
[0135]
The quantization circuit 63 quantizes the block of DCT coefficients from the DCT circuit 62 according to the same quantization table used for JPEG encoding in the encoder 21 (FIG. 2), and the resulting quantized DCT coefficients Are sequentially supplied to the prediction tap extraction circuit 64 and the class tap extraction circuit 65.
[0136]
For the pixel that is the target pixel among the 16 × 16 pixels of the high-quality block that the normal equation addition circuit 67, described later, has set as the target high-quality block, the prediction tap extraction circuit 64 configures the same prediction tap as that configured by the prediction tap extraction circuit 41 of FIG. 6, by extracting the necessary quantized DCT coefficients from the output of the quantization circuit 63. This prediction tap is supplied from the prediction tap extraction circuit 64 to the normal equation addition circuit 67 as student data, which serve as the student during learning.
[0137]
For the target high-quality block, the class tap extraction circuit 65 configures the same class tap as that configured by the class tap extraction circuit 42 of FIG., by extracting the necessary quantized DCT coefficients from the output of the quantization circuit 63. This class tap is supplied from the class tap extraction circuit 65 to the class classification circuit 66.
[0138]
The class classification circuit 66 performs the same processing as the class classification circuit 43 in FIG. 6 using the class tap from the class tap extraction circuit 65, thereby classifying the high-quality block of interest, and supplies the resulting class code to the normal equation addition circuit 67.
[0139]
The normal equation addition circuit 67 is supplied, as teacher data, with the same HD image as that supplied to the thinning circuit 60. The normal equation addition circuit 67 divides the HD image into high-quality blocks of 16 × 16 pixels and sequentially sets each high-quality block as the target high-quality block. Further, the normal equation addition circuit 67 sequentially sets as the target pixel, for example in raster scan order, those of the 16 × 16 pixels constituting the target high-quality block that have not yet been set as the target pixel, and performs addition on the pixel value of the target pixel and the prediction tap (quantized DCT coefficients) from the prediction tap extraction circuit 64.
[0140]
That is, for each class corresponding to the class code supplied from the class classification circuit 66, the normal equation addition circuit 67 uses the prediction tap (student data) to perform the calculations corresponding to the multiplication between student data (x_in x_im) and the summation (Σ) that form the components of the matrix A in equation (8).
[0141]
Further, for each class corresponding to the class code supplied from the class classification circuit 66, the normal equation addition circuit 67 uses the prediction tap (student data) and the target pixel (teacher data) to perform the calculations corresponding to the multiplication of student data and teacher data (x_in y_i) and the summation (Σ) that form the components of the vector v in equation (8).
[0142]
Note that the addition as described above in the normal equation adding circuit 67 is performed for each pixel position mode with respect to the pixel of interest for each class.
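The per-class, per-pixel-position-mode addition can be pictured as one running pair (A, v) for each combination of class code and mode. The following Python sketch is a simplified illustration; the class name NormalEquationAccumulator, its method add, and the tiny tap length in the usage note are hypothetical.

```python
from collections import defaultdict


class NormalEquationAccumulator:
    """Keeps a separate matrix A and vector v for every
    (class code, pixel position mode) pair."""

    def __init__(self, n_taps):
        self.n = n_taps
        self.A = defaultdict(lambda: [[0.0] * n_taps for _ in range(n_taps)])
        self.v = defaultdict(lambda: [0.0] * n_taps)

    def add(self, class_code, mode, taps, pixel):
        """Add one (student taps, teacher pixel) pair to its bucket."""
        key = (class_code, mode)
        A, v = self.A[key], self.v[key]
        for a in range(self.n):
            v[a] += taps[a] * pixel              # sum of x_in * y_i
            for b in range(self.n):
                A[a][b] += taps[a] * taps[b]     # sum of x_in * x_im
```

For instance, with two taps, adding ([1, 2], 5) and ([0, 1], 2) to class 3, mode 0 leaves A = [[1, 2], [2, 5]] and v = [5, 12] for that bucket, while other buckets are unaffected.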
[0143]
The normal equation addition circuit 67 performs the above addition using all the pixels constituting the HD image supplied to it as teacher data as the target pixel, and thereby sets up the normal equation shown in equation (8) for each pixel position mode of each class.
[0144]
The tap coefficient determination circuit 68 obtains 256 sets of tap coefficients for each class by solving the normal equations generated for each class (and for each pixel position mode) in the normal equation addition circuit 67, and supplies them to the address corresponding to each class in the coefficient table storage unit 69.
[0145]
Depending on the number of images prepared as learning images, their contents, and so on, there may be classes for which the normal equation addition circuit 67 cannot obtain the number of normal equations necessary for determining the tap coefficients. For such classes, the tap coefficient determination circuit 68 outputs, for example, default tap coefficients.
[0146]
The coefficient table storage unit 69 stores 256 sets of tap coefficients for each class supplied from the tap coefficient determination circuit 68.
[0147]
Next, processing (learning processing) of the learning device in FIG. 12 will be described with reference to the flowchart in FIG.
[0148]
An HD image, which is image data for learning, is supplied to the thinning circuit 60 as teacher data. In step S30, the thinning circuit 60 thins out the pixels of the HD image serving as teacher data and generates, as semi-teacher data, an SD image in which the numbers of horizontal and vertical pixels are both halved.
[0149]
In step S31, the blocking circuit 61 divides the SD image serving as semi-teacher data obtained by the thinning circuit 60 into pixel blocks of 8 × 8 pixels, as in JPEG encoding by the encoder 21 (FIG. 2), and the process proceeds to step S32. In step S32, the DCT circuit 62 sequentially reads out the pixel blocks formed by the blocking circuit 61 and applies DCT processing to them to form blocks of DCT coefficients, and the process proceeds to step S33. In step S33, the quantization circuit 63 sequentially reads the blocks of DCT coefficients obtained in the DCT circuit 62 and quantizes them according to the same quantization table as that used for JPEG encoding in the encoder 21, forming blocks composed of quantized DCT coefficients (DCT blocks).
[0150]
On the other hand, the HD image serving as teacher data is also supplied to the normal equation addition circuit 67, which divides the HD image into high-quality blocks of 16 × 16 pixels and sets, as the target high-quality block, a high-quality block that has not yet been set as such. Further, in step S34, the class tap extraction circuit 65 configures a class tap by extracting the quantized DCT coefficients used for classifying the target high-quality block from the DCT block that the quantization circuit 63 obtained from the pixel blocks formed by the blocking circuit 61, and supplies it to the class classification circuit 66. In step S35, the class classification circuit 66 classifies the target high-quality block using the class tap from the class tap extraction circuit 65, in the same manner as described with reference to the flowchart of FIG., supplies the resulting class code to the normal equation addition circuit 67, and the process proceeds to step S36.
[0151]
In step S36, the normal equation addition circuit 67 sets as the target pixel a pixel of the target high-quality block that has not yet been set as the target pixel in raster scan order, and the prediction tap extraction circuit 64 configures, for that target pixel, the same prediction tap as that configured by the prediction tap extraction circuit 41 of FIG. 6, by extracting the necessary quantized DCT coefficients from the output of the quantization circuit 63. The prediction tap extraction circuit 64 then supplies the prediction tap for the target pixel to the normal equation addition circuit 67 as student data, and the process proceeds to step S37.
[0152]
In step S37, the normal equation addition circuit 67 performs the above-described addition for the matrix A and the vector v of equation (8), using the target pixel as teacher data and the prediction tap (the quantized DCT coefficients constituting it) as student data. This addition is performed for each class corresponding to the class code from the class classification circuit 66 and for each pixel position mode of the target pixel.
[0153]
Then, the process proceeds to step S38, and the normal equation addition circuit 67 determines whether addition has been performed with all the pixels of the target high-quality block as the target pixel. If it is determined in step S38 that addition has not yet been performed with all the pixels of the target high-quality block as the target pixel, the process returns to step S36, where the normal equation addition circuit 67 newly sets as the target pixel a pixel of the target high-quality block that has not yet been set as the target pixel in raster scan order, and the same processing is repeated thereafter.
[0154]
If it is determined in step S38 that addition has been performed with all pixels of the target high-quality block as the target pixel, the process proceeds to step S39, and the normal equation addition circuit 67 determines whether all the high-quality blocks obtained from the image serving as teacher data have been processed as the target high-quality block. If it is determined in step S39 that not all the high-quality blocks obtained from the image serving as teacher data have yet been processed as the target high-quality block, the process returns to step S34, a high-quality block that has not yet been set as the target high-quality block is newly set as the target high-quality block, and the same processing is repeated thereafter.
[0155]
On the other hand, if it is determined in step S39 that all the high-quality blocks obtained from the image serving as teacher data have been processed as the target high-quality block, that is, if the normal equation addition circuit 67 has obtained a normal equation for each pixel position mode of each class, the process proceeds to step S40. There, the tap coefficient determination circuit 68 solves the normal equations generated for each pixel position mode of each class, thereby obtaining, for each class, the 256 sets of tap coefficients corresponding to the 256 pixel position modes of that class, and supplies them to the coefficient table storage unit 69, where they are stored at the address corresponding to each class. The process then ends.
[0156]
As described above, the tap coefficient for each class stored in the coefficient table storage unit 69 is stored in the coefficient table storage unit 44 of FIG.
[0157]
Therefore, the tap coefficients stored in the coefficient table storage unit 44 have been obtained by learning so that the prediction error (here, the squared error) of the predicted value of the original pixel value obtained by the linear prediction calculation is statistically minimized. As a result, the coefficient conversion circuit 32 shown in FIG. 6 can decode a JPEG-encoded image into a high-quality image whose image quality is as close as possible to that of the HD image used as teacher data.
[0158]
Further, according to the coefficient conversion circuit 32, as described above, the decoding of the JPEG-encoded image and the processing for improving its image quality are performed at the same time, so a high-quality decoded image can be obtained efficiently from the JPEG-encoded image.
[0159]
Next, FIG. 14 shows a second configuration example of the coefficient conversion circuit 32 of FIG. 3. In the figure, portions corresponding to those in FIG. 6 are denoted by the same reference numerals, and description thereof will be omitted below as appropriate. That is, the coefficient conversion circuit 32 in FIG. 14 is basically configured in the same manner as in FIG. 6 except that the inverse quantization circuit 71 is newly provided.
[0160]
In the embodiment of FIG. 14, the inverse quantization circuit 71 is supplied with quantized DCT coefficients for each block obtained by entropy decoding the encoded data in the entropy decoding circuit 31 (FIG. 3).
[0161]
As described above, the entropy decoding circuit 31 obtains a quantization table from the encoded data in addition to the quantized DCT coefficients. In the embodiment of FIG. 14, this quantization table is also supplied from the entropy decoding circuit 31 to the inverse quantization circuit 71.
[0162]
The inverse quantization circuit 71 inversely quantizes the quantized DCT coefficients from the entropy decoding circuit 31 according to the quantization table, which is also supplied from the entropy decoding circuit 31, and supplies the resulting DCT coefficients to the prediction tap extraction circuit 41 and the class tap extraction circuit 42.
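As a minimal sketch of the inverse quantization performed here, each quantized DCT coefficient can be multiplied by the corresponding quantization step from the quantization table. The function name and the flat 64-element layout of a block are assumptions made for this illustration.

```python
def dequantize_block(quantized, table):
    """Multiply each quantized DCT coefficient by its quantization step.

    Both arguments are flat sequences of 64 values for one 8 x 8 block
    (the flat layout is an assumption of this sketch).
    """
    if len(quantized) != 64 or len(table) != 64:
        raise ValueError("expected 64 coefficients and 64 quantization steps")
    return [q * step for q, step in zip(quantized, table)]
```

The result is only an approximation of the DCT coefficients before quantization, because the rounding performed at quantization cannot be undone.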
[0163]
Therefore, the prediction tap extraction circuit 41 and the class tap extraction circuit 42 form the prediction tap and the class tap, respectively, from the DCT coefficients instead of the quantized DCT coefficients, and thereafter the same processing as in the case of FIG. 6 is performed on the DCT coefficients.
[0164]
As described above, in the embodiment of FIG. 14, processing is performed on the DCT coefficients rather than the quantized DCT coefficients, so the tap coefficients stored in the coefficient table storage unit 44 need to be different from those in the case of FIG. 6.
[0165]
Therefore, FIG. 15 shows a configuration example of an embodiment of a learning apparatus that performs learning processing of the tap coefficients to be stored in the coefficient table storage unit 44 of FIG. 14. In the figure, portions corresponding to those in FIG. 12 are denoted by the same reference numerals, and description thereof will be omitted below as appropriate. That is, the learning apparatus in FIG. 15 is basically configured in the same manner as in FIG. 12 except that an inverse quantization circuit 81 is newly provided at the stage following the quantization circuit 63.
[0166]
In the embodiment of FIG. 15, the inverse quantization circuit 81 inversely quantizes the quantized DCT coefficients output from the quantization circuit 63 in the same manner as the inverse quantization circuit 71 of FIG. 14, and supplies the resulting DCT coefficients to the prediction tap extraction circuit 64 and the class tap extraction circuit 65.
[0167]
Therefore, the prediction tap extraction circuit 64 and the class tap extraction circuit 65 form the prediction tap and the class tap, respectively, from the DCT coefficients instead of the quantized DCT coefficients, and thereafter the same processing as in the case of FIG. 12 is performed on the DCT coefficients.
[0168]
As a result, tap coefficients are obtained that reduce the influence of the quantization error arising when the DCT coefficients are quantized and then inversely quantized.
[0169]
Next, FIG. 16 shows a third configuration example of the coefficient conversion circuit 32 of FIG. 3. In the figure, portions corresponding to those in FIG. 6 are denoted by the same reference numerals, and description thereof will be omitted below as appropriate. That is, the coefficient conversion circuit 32 in FIG. 16 is basically configured in the same manner as in FIG. 6 except that the class tap extraction circuit 42 and the class classification circuit 43 are not provided.
[0170]
Therefore, the embodiment of FIG. 16 has no concept of class. Since this can also be regarded as having a single class, the coefficient table storage unit 44 stores tap coefficients for only one class, and processing is performed using them.
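The single-class case can be illustrated with the following sketch of the product-sum prediction. It assumes, purely for illustration, that the prediction tap is one flat list of coefficients of a block, that there is one coefficient set for each of the 256 pixel position modes, and that the modes are numbered in raster order over the 16 × 16 output block. None of these layout choices is taken from the specification.

```python
def predict_block(prediction_tap, tap_coefficients):
    """Return a 16 x 16 block, one product-sum per pixel position mode.

    tap_coefficients[mode] is the coefficient set for that pixel position
    mode; the raster ordering of modes is an assumption of this sketch.
    """
    if len(tap_coefficients) != 256:
        raise ValueError("expected one coefficient set per pixel position mode")
    pixels = []
    for weights in tap_coefficients:
        if len(weights) != len(prediction_tap):
            raise ValueError("coefficient set and prediction tap differ in length")
        pixels.append(sum(w * x for w, x in zip(weights, prediction_tap)))
    # Arrange the 256 predicted values as 16 rows of 16 pixels.
    return [pixels[r * 16:(r + 1) * 16] for r in range(16)]
```

With classes, the only change would be to select `tap_coefficients` by class code before calling the function.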
[0171]
Thus, in the embodiment of FIG. 16, the tap coefficients stored in the coefficient table storage unit 44 are different from those in the case of FIG. 6.
[0172]
Therefore, FIG. 17 shows a configuration example of an embodiment of a learning apparatus that performs learning processing of the tap coefficients to be stored in the coefficient table storage unit 44 of FIG. 16. In the figure, portions corresponding to those in FIG. 12 are denoted by the same reference numerals, and description thereof will be omitted below as appropriate. That is, the learning apparatus in FIG. 17 is basically configured in the same manner as in FIG. 12 except that the class tap extraction circuit 65 and the class classification circuit 66 are not provided.
[0173]
Therefore, in the learning apparatus of FIG. 17, the above-described addition is performed for each pixel position mode in the normal equation adding circuit 67 regardless of the class. Then, the tap coefficient determination circuit 68 calculates the tap coefficient by solving the normal equation generated for each pixel position mode.
[0174]
Next, FIG. 18 shows a fourth configuration example of the coefficient conversion circuit 32 of FIG. 3. In the figure, portions corresponding to those in FIG. 6 or FIG. 14 are denoted by the same reference numerals, and description thereof will be omitted below as appropriate. That is, the coefficient conversion circuit 32 of FIG. 18 is basically configured in the same manner as in FIG. 6 except that the class tap extraction circuit 42 and the class classification circuit 43 are not provided and the inverse quantization circuit 71 is newly provided.
[0175]
Accordingly, in the embodiment of FIG. 18, as in the above-described embodiment of FIG. 16, tap coefficients for only one class are stored in the coefficient table storage unit 44, and processing is performed using them.
[0176]
Further, in the embodiment of FIG. 18, as in the embodiment of FIG. 14, the prediction tap extraction circuit 41 forms the prediction tap not from the quantized DCT coefficients but from the DCT coefficients output by the inverse quantization circuit 71, and the subsequent processing is also performed on the DCT coefficients.
[0177]
Therefore, also in the embodiment of FIG. 18, the tap coefficients stored in the coefficient table storage unit 44 are different from those in the case of FIG. 6.
[0178]
Therefore, FIG. 19 shows a configuration example of an embodiment of a learning apparatus that performs learning processing of the tap coefficients to be stored in the coefficient table storage unit 44 of FIG. 18. In the figure, portions corresponding to those in FIG. 12 or FIG. 15 are denoted by the same reference numerals, and description thereof will be omitted below as appropriate. That is, the learning apparatus of FIG. 19 is basically configured in the same manner as in FIG. 12 except that the class tap extraction circuit 65 and the class classification circuit 66 are not provided and the inverse quantization circuit 81 is newly provided.
[0179]
Accordingly, in the learning device of FIG. 19, the prediction tap extraction circuit 64 configures a prediction tap not for the quantized DCT coefficient but for the DCT coefficient, and processing is performed for the DCT coefficient thereafter. Further, the above addition is performed regardless of the class in the normal equation adding circuit 67, and the tap coefficient is obtained by solving the normal equation generated regardless of the class in the tap coefficient determining circuit 68.
[0180]
Next, the above description has dealt with JPEG-encoded images obtained by compression-encoding still images. However, the present invention can also be applied to compression-encoded moving images, for example, MPEG-encoded images.
[0181]
That is, FIG. 20 shows a configuration example of the encoder 21 of FIG. 2 when MPEG encoding is performed.
[0182]
Frames (or fields) constituting a moving image to be MPEG-encoded are sequentially supplied to the motion detection circuit 91 and the calculator 92.
[0183]
The motion detection circuit 91 detects a motion vector in units of 16 × 16 pixel macroblocks for the frame supplied thereto, and supplies the motion vector to the entropy encoding circuit 96 and the motion compensation circuit 100.
[0184]
If the image supplied thereto is an I (Intra) picture, the arithmetic unit 92 supplies it to the blocking circuit 93 as is. If it is a P (Predictive) or B (Bidirectionally predictive) picture, the arithmetic unit 92 calculates the difference from the reference image supplied from the motion compensation circuit 100 and supplies the difference values to the blocking circuit 93.
[0185]
The blocking circuit 93 divides the output of the arithmetic unit 92 into pixel blocks of 8 × 8 pixels and supplies them to the DCT circuit 94. The DCT circuit 94 performs DCT processing on each pixel block from the blocking circuit 93 and supplies the resulting DCT coefficients to the quantization circuit 95. The quantization circuit 95 quantizes the block-wise DCT coefficients from the DCT circuit 94 with a predetermined quantization step, and supplies the resulting quantized DCT coefficients to the entropy encoding circuit 96. The entropy encoding circuit 96 entropy-encodes the quantized DCT coefficients from the quantization circuit 95, adds the motion vector from the motion detection circuit 91 and other necessary information, and outputs the resulting encoded data (for example, an MPEG transport stream) as the MPEG encoding result.
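The DCT and quantization stages of this chain can be sketched as follows. The sketch assumes an orthonormal 8 × 8 DCT-II and a single uniform quantization step for all coefficients; an actual MPEG encoder uses quantization matrices and scale factors, and the function names are hypothetical.

```python
import math

N = 8  # block size used throughout this sketch


def dct_2d(block):
    """Orthonormal 8 x 8 DCT-II of a block given as 8 rows of 8 numbers."""
    def c(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for y in range(N):
                for x in range(N):
                    s += (block[y][x]
                          * math.cos((2 * y + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * x + 1) * v * math.pi / (2 * N)))
            out[u][v] = c(u) * c(v) * s
    return out


def quantize(coeffs, step):
    """Uniform quantization: divide by the step and round to an integer."""
    return [[round(value / step) for value in row] for row in coeffs]
```

For a flat block, all the energy falls into the DC coefficient, which is why coarse quantization of the AC coefficients costs little for smooth image regions.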
[0186]
Of the quantized DCT coefficients output from the quantization circuit 95, those of I pictures and P pictures need to be locally decoded for use as reference images for P pictures or B pictures to be encoded later. They are therefore supplied not only to the entropy encoding circuit 96 but also to the inverse quantization circuit 97.
[0187]
The inverse quantization circuit 97 inversely quantizes the quantized DCT coefficients from the quantization circuit 95 into DCT coefficients and supplies them to the inverse DCT circuit 98. The inverse DCT circuit 98 performs inverse DCT processing on the DCT coefficients from the inverse quantization circuit 97 and outputs the result to the arithmetic unit 99. In addition to the output of the inverse DCT circuit 98, the arithmetic unit 99 is also supplied with the reference image output from the motion compensation circuit 100. When the output of the inverse DCT circuit 98 is that of a P picture, the arithmetic unit 99 adds that output and the output of the motion compensation circuit 100 to decode the original image, and supplies it to the motion compensation circuit 100. When the output of the inverse DCT circuit 98 is that of an I picture, the output is already a decoded image of the I picture, so the arithmetic unit 99 supplies it to the motion compensation circuit 100 as is.
[0188]
The motion compensation circuit 100 performs motion compensation on the locally decoded image supplied from the arithmetic unit 99 according to the motion vector from the motion detection circuit 91, and supplies the motion-compensated image to the arithmetic units 92 and 99 as a reference image.
[0189]
Here, FIG. 21 shows a configuration of an example of a conventional MPEG decoder that decodes encoded data obtained as a result of the above MPEG encoding.
[0190]
The encoded data is supplied to the entropy decoding circuit 111. The entropy decoding circuit 111 performs entropy decoding on the encoded data to obtain quantized DCT coefficients, motion vectors, and other information. The quantized DCT coefficient is supplied to the inverse quantization circuit 112, and the motion vector is supplied to the motion compensation circuit 116.
[0191]
The inverse quantization circuit 112 inversely quantizes the quantized DCT coefficients from the entropy decoding circuit 111 into DCT coefficients and supplies them to the inverse DCT circuit 113. The inverse DCT circuit 113 performs inverse DCT processing on the DCT coefficients from the inverse quantization circuit 112 and outputs the result to the arithmetic unit 114. In addition to the output of the inverse DCT circuit 113, the arithmetic unit 114 is also supplied, as a reference image, with the output of the motion compensation circuit 116, which is an already decoded I picture or P picture that has been motion-compensated according to the motion vector from the entropy decoding circuit 111. When the output of the inverse DCT circuit 113 is that of a P or B picture, the arithmetic unit 114 adds that output and the output of the motion compensation circuit 116 to decode the original image, and supplies it to the block decomposition circuit 115. When the output of the inverse DCT circuit 113 is that of an I picture, the output is already a decoded image of the I picture, so the arithmetic unit 114 supplies it to the block decomposition circuit 115 as is.
[0192]
The block decomposition circuit 115 obtains and outputs a decoded image by unblocking the decoded image supplied from the computing unit 114 in units of pixel blocks.
[0193]
On the other hand, the motion compensation circuit 116 receives the I picture and the P picture of the decoded image output from the computing unit 114 and performs motion compensation according to the motion vector from the entropy decoding circuit 111. Then, the motion compensation circuit 116 supplies the motion-compensated image to the computing unit 114 as a reference image.
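The role of the arithmetic unit 114 in this conventional decoder can be sketched as follows. The function name, the string picture types, and the row-list image representation are assumptions of the sketch.

```python
def reconstruct(picture_type, idct_output, reference=None):
    """Combine the inverse DCT output with the reference image when needed.

    I pictures pass through unchanged; P and B pictures are residuals, so
    the motion-compensated reference is added back pixel by pixel.
    """
    if picture_type == "I":
        return [row[:] for row in idct_output]
    if picture_type in ("P", "B"):
        if reference is None:
            raise ValueError("P and B pictures require a reference image")
        return [[d + r for d, r in zip(d_row, r_row)]
                for d_row, r_row in zip(idct_output, reference)]
    raise ValueError("unknown picture type: %r" % (picture_type,))
```

Only the reconstructed I and P pictures would then be fed back to motion compensation as future references, as described above.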
[0194]
Encoded data that has been MPEG-encoded as described above can also be decoded efficiently into a high-quality image by the decoder 22 shown in FIG. 3.
[0195]
In other words, the encoded data is supplied to the entropy decoding circuit 31, and the entropy decoding circuit 31 performs entropy decoding on the encoded data. Quantized DCT coefficients, motion vectors, and other information obtained as a result of this entropy decoding are supplied from the entropy decoding circuit 31 to the coefficient conversion circuit 32.
[0196]
The coefficient conversion circuit 32 performs a predetermined prediction calculation using the quantized DCT coefficients Q from the entropy decoding circuit 31 and the tap coefficients obtained by learning and, as necessary, performs motion compensation according to the motion vector from the entropy decoding circuit 31. It thereby decodes the quantized DCT coefficients into high-quality pixel values, and supplies a high-quality block composed of those pixel values to the block decomposition circuit 33.
[0197]
The block decomposition circuit 33 unblocks the high-quality blocks obtained in the coefficient conversion circuit 32, thereby obtaining and outputting a high-quality decoded image in which the numbers of horizontal and vertical pixels are both, for example, twice those of the MPEG-encoded image.
[0198]
Next, FIG. 22 shows a configuration example of the coefficient conversion circuit 32 in FIG. 3 for the case where the decoder 22 decodes MPEG-encoded data. In the figure, portions corresponding to those in FIG. 18 or FIG. 21 are denoted by the same reference numerals, and description thereof will be omitted below as appropriate. That is, the coefficient conversion circuit 32 in FIG. 22 is basically configured in the same manner as in FIG. 18 except that the arithmetic unit 114 and the motion compensation circuit 116 of FIG. 21 are provided.
[0199]
Therefore, in the coefficient conversion circuit 32 of FIG. 22, the quantized DCT coefficients are inversely quantized by the inverse quantization circuit 71, and the prediction tap extraction circuit 41 forms a prediction tap from the resulting DCT coefficients. Then, the product-sum operation circuit 45 performs a prediction calculation using the prediction tap and the tap coefficients stored in the coefficient table storage unit 44, and thereby outputs high-quality data in which the numbers of horizontal and vertical pixels are both twice those of the original image.
[0200]
Then, the arithmetic unit 114 adds the output of the motion compensation circuit 116 to the output of the product-sum operation circuit 45 as necessary, thereby decoding a high-quality image in which the numbers of horizontal and vertical pixels are both twice those of the original image, and outputs it to the block decomposition circuit 33 (FIG. 3).
[0201]
That is, for an I picture, the output of the product-sum operation circuit 45 is already a high-quality image in which the numbers of horizontal and vertical pixels are both twice those of the original image, so the arithmetic unit 114 outputs it to the block decomposition circuit 33 as is.
[0202]
For a P or B picture, the output of the product-sum operation circuit 45 is the difference between the high-quality image in which the numbers of horizontal and vertical pixels are both twice those of the original image and the high-quality reference image. The arithmetic unit 114 therefore adds the high-quality reference image supplied from the motion compensation circuit 116 to the output of the product-sum operation circuit 45, thereby decoding a high-quality image in which the numbers of horizontal and vertical pixels are both twice those of the original image, and outputs it to the block decomposition circuit 33.
[0203]
On the other hand, the motion compensation circuit 116 receives the I and P pictures among the high-quality decoded images output from the arithmetic unit 114, and performs motion compensation on these high-quality decoded images of the I or P pictures using the motion vector from the entropy decoding circuit 31 (FIG. 3), thereby obtaining a high-quality reference image, which it supplies to the arithmetic unit 114.
[0204]
Here, since the numbers of horizontal and vertical pixels of the decoded image are both twice those of the original image, the motion compensation circuit 116 performs motion compensation according to, for example, a motion vector obtained by doubling the horizontal and vertical magnitudes of the motion vector from the entropy decoding circuit 31.
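The scaling of the motion vector for the doubled-resolution reference image can be sketched as follows. The sketch assumes integer-pixel motion vectors given as (dx, dy) and clamps coordinates at the image border; half-pixel accuracy and the actual boundary handling are not covered, and the function names are hypothetical.

```python
def scale_motion_vector(mv, factor=2):
    """Scale a (dx, dy) motion vector to match the enlarged image."""
    dx, dy = mv
    return (dx * factor, dy * factor)


def motion_compensate_block(image, top, left, size, mv):
    """Copy a size x size block displaced by mv = (dx, dy) from image.

    Coordinates outside the image are clamped to the nearest edge pixel
    (an assumption of this sketch).
    """
    height, width = len(image), len(image[0])
    dx, dy = mv
    block = []
    for y in range(top + dy, top + dy + size):
        yy = min(max(y, 0), height - 1)
        row = []
        for x in range(left + dx, left + dx + size):
            xx = min(max(x, 0), width - 1)
            row.append(image[yy][xx])
        block.append(row)
    return block
```

For example, a vector of (1, 0) detected on the original image becomes (2, 0) on the decoded image with twice the pixels in each direction, and the block size used for compensation is doubled in the same way.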
[0205]
Next, FIG. 23 shows a configuration example of an embodiment of a learning apparatus that learns the tap coefficients to be stored in the coefficient table storage unit 44 of FIG. 22. In the figure, portions corresponding to those in FIG. 19 are denoted by the same reference numerals, and description thereof will be omitted below as appropriate.
[0206]
An HD image for learning is input to the thinning circuit 120 as teacher data. The thinning circuit 120 thins out the pixels of the HD image serving as the teacher data, for example, in the same way as the thinning circuit 60, and thereby generates semi-teacher data, which is an SD image in which the numbers of horizontal and vertical pixels are both halved. The SD image serving as the semi-teacher data is then supplied to the motion vector detection circuit 121 and the arithmetic unit 122.
[0207]
The motion vector detection circuit 121, the arithmetic unit 122, the blocking circuit 123, the DCT circuit 124, the quantization circuit 125, the inverse quantization circuit 127, the inverse DCT circuit 128, the arithmetic unit 129, and the motion compensation circuit 130 perform the same processing as the motion detection circuit 91, the arithmetic unit 92, the blocking circuit 93, the DCT circuit 94, the quantization circuit 95, the inverse quantization circuit 97, the inverse DCT circuit 98, the arithmetic unit 99, and the motion compensation circuit 100 of FIG. 20, respectively. As a result, the quantization circuit 125 outputs the same quantized DCT coefficients as those output from the quantization circuit 95 of FIG. 20.
[0208]
The quantized DCT coefficients output from the quantization circuit 125 are supplied to the inverse quantization circuit 81. The inverse quantization circuit 81 inversely quantizes the quantized DCT coefficients from the quantization circuit 125 into DCT coefficients and supplies them to the prediction tap extraction circuit 64. The prediction tap extraction circuit 64 forms a prediction tap from the DCT coefficients from the inverse quantization circuit 81 and supplies the prediction tap to the normal equation addition circuit 67 as student data.
[0209]
On the other hand, the HD image as teacher data is supplied to the computing unit 132 in addition to the thinning circuit 120. The computing unit 132 subtracts the output of the interpolation circuit 131 from the HD image as the teacher data as necessary, and supplies it to the normal equation addition circuit 67.
[0210]
That is, the interpolation circuit 131 generates a high-quality reference image in which the number of horizontal and vertical pixels of the reference image of the SD image output from the motion compensation circuit 130 is doubled, and supplies the generated reference image to the calculator 132.
[0211]
When the HD image supplied thereto is an I picture, the arithmetic unit 132 supplies the HD image of the I picture as is to the normal equation addition circuit 67 as teacher data. When the HD image supplied thereto is a P or B picture, the arithmetic unit 132 calculates the difference between the HD image of the P or B picture and the high-quality reference image output from the interpolation circuit 131, thereby obtaining a high-quality version of the difference of the SD image (semi-teacher data) output from the arithmetic unit 122, and outputs it to the normal equation addition circuit 67 as teacher data.
[0212]
Note that the interpolation circuit 131 can increase the number of pixels by, for example, simple interpolation. The interpolation circuit 131 can also increase the number of pixels by, for example, class classification adaptive processing. Further, the arithmetic unit 132 can also use, as the reference image, an image obtained by MPEG-encoding the HD image serving as the teacher data, locally decoding it, and motion-compensating the result.
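The generation of teacher data for the MPEG learning apparatus can be sketched as follows. The sketch assumes that thinning keeps every second pixel in each direction and that the simple interpolation is pixel replication; both are only one possible choice, and the function names are hypothetical.

```python
def thin(hd):
    """Keep every second pixel horizontally and vertically (one possible thinning)."""
    return [row[::2] for row in hd[::2]]


def interpolate_x2(sd):
    """Double both dimensions by pixel replication (a simple interpolation)."""
    out = []
    for row in sd:
        wide = [p for p in row for _ in range(2)]
        out.append(wide)
        out.append(wide[:])
    return out


def teacher_for(picture_type, hd, sd_reference=None):
    """Teacher data: the HD image itself for I pictures, otherwise the
    difference between the HD image and the enlarged reference image."""
    if picture_type == "I":
        return [row[:] for row in hd]
    if sd_reference is None:
        raise ValueError("P and B pictures require an SD reference image")
    reference = interpolate_x2(sd_reference)
    return [[h - r for h, r in zip(h_row, r_row)]
            for h_row, r_row in zip(hd, reference)]
```

In the apparatus described above the reference would be the motion-compensated SD image from the motion compensation circuit 130; the usage in the test simply reuses the thinned image to keep the example small.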
[0213]
The normal equation addition circuit 67 uses the output of the arithmetic unit 132 as teacher data and the prediction tap formed from the output of the inverse quantization circuit 81 as student data, and performs the addition described above, thereby generating a normal equation.
[0214]
Then, the tap coefficient determination circuit 68 obtains a tap coefficient by solving the normal equation generated by the normal equation addition circuit 67 and supplies the tap coefficient to the coefficient table storage unit 69 for storage.
[0215]
By using the tap coefficients obtained in this way, the product-sum operation circuit 45 shown in FIG. 22 can decode the MPEG-encoded data and perform the processing for improving image quality at the same time. Therefore, a decoded image that is a high-quality image, in this case an HD image in which the numbers of horizontal and vertical pixels are both doubled, can be obtained efficiently from the MPEG-encoded image.
[0216]
Note that the coefficient conversion circuit 32 in FIG. 22 can be configured without the inverse quantization circuit 71. In this case, the learning apparatus in FIG. 23 may be configured without providing the inverse quantization circuit 81.
[0217]
Also, the coefficient conversion circuit 32 in FIG. 22 can be configured by providing the class tap extraction circuit 42 and the class classification circuit 43 as in the case of FIG. 6. In this case, the learning apparatus of FIG. 23 may be configured by providing the class tap extraction circuit 65 and the class classification circuit 66 as in the case of FIG. 12.
[0218]
Further, in the above-described case, the decoder 22 (FIG. 3) obtains a decoded image in which the spatial resolution of the original image is doubled. However, the decoder 22 can also obtain a decoded image in which the spatial resolution of the original image is multiplied by an arbitrary factor, or a decoded image in which the temporal resolution of the original image is improved.
[0219]
That is, for example, when the image to be MPEG-encoded has a low temporal resolution as shown in FIG. 24(A), the decoder 22 can decode the encoded data obtained by MPEG-encoding that image into an image in which the temporal resolution of the original image is doubled, as shown in FIG. 24(B). Further, for example, when the image to be MPEG-encoded is a 24 frames/second image of the kind used for movies, as shown in FIG. 25(A), the decoder 22 can decode the encoded data obtained by MPEG-encoding that image into a 60 frames/second image in which the temporal resolution of the original image is multiplied by 60/24, as shown in FIG. 25(B). In this case, so-called 2-3 pull-down can be performed easily.
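For reference, the cadence of a conventional 2-3 pull-down can be sketched as below. This is a frame-level simplification introduced for illustration (broadcast pull-down operates on fields), and it only repeats source frames; it does not represent the learned prediction with which the decoder 22 would generate the additional frames.

```python
def pulldown_2_3(num_source_frames):
    """Return, for each output frame, the index of the source frame shown.

    Source frames are repeated alternately 2 and 3 times, so every
    2 source frames yield 5 output frames (24 -> 60 frames per second).
    """
    out = []
    for i in range(num_source_frames):
        out.extend([i] * (2 if i % 2 == 0 else 3))
    return out
```

One second of 24 frames/second material therefore maps onto exactly 60 output frames.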
[0220]
Here, when the temporal resolution is improved in the decoder 22 as described above, the prediction tap and the class tap can be formed from the DCT coefficients of two or more frames, for example, as shown in FIG. 26.
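A tap spanning several frames can be sketched as a simple concatenation of the DCT coefficients of co-located blocks. Which coefficients of which frames are actually selected is determined by the tap structure of FIG. 26 and is not reproduced here; the function name and the use of all coefficients of each block are assumptions of this sketch.

```python
def multi_frame_tap(blocks):
    """Concatenate co-located DCT blocks from several frames into one tap.

    blocks is a list of blocks, each given as a list of rows of
    DCT coefficients, ordered by frame.
    """
    tap = []
    for block in blocks:
        for row in block:
            tap.extend(row)
    return tap
```

The resulting list can be used wherever a single-frame prediction tap is used, provided the tap coefficients were learned with the same tap layout.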
[0221]
Further, the decoder 22 can obtain a decoded image in which not only one of the spatial resolution and the temporal resolution but both are improved.
[0222]
Next, the series of processes described above can be performed by hardware or software. When a series of processing is performed by software, a program constituting the software is installed in a general-purpose computer or the like.
[0223]
Therefore, FIG. 27 shows a configuration example of an embodiment of a computer in which a program for executing the series of processes described above is installed.
[0224]
The program can be recorded in advance in a hard disk 205 or ROM 203 as a recording medium built in the computer.
[0225]
Alternatively, the program can be stored (recorded) temporarily or permanently on a removable recording medium 211 such as a floppy disk, a CD-ROM (Compact Disc Read Only Memory), an MO (Magneto optical) disk, a DVD (Digital Versatile Disc), a magnetic disk, or a semiconductor memory. Such a removable recording medium 211 can be provided as so-called package software.
[0226]
The program can be installed on the computer from the removable recording medium 211 as described above. Alternatively, it can be transferred from a download site to the computer wirelessly via an artificial satellite for digital satellite broadcasting, or transferred to the computer via a network such as a LAN (Local Area Network) or the Internet. The computer can receive the program transferred in this way with the communication unit 208 and install it on the built-in hard disk 205.
[0227]
The computer includes a CPU (Central Processing Unit) 202. An input/output interface 210 is connected to the CPU 202 via a bus 201. When the user inputs a command via the input/output interface 210 by operating the input unit 207, which includes a keyboard, a mouse, a microphone, and the like, the CPU 202 executes the program stored in the ROM (Read Only Memory) 203 accordingly. Alternatively, the CPU 202 loads into the RAM (Random Access Memory) 204 and executes a program stored in the hard disk 205, a program transferred from a satellite or a network, received by the communication unit 208, and installed in the hard disk 205, or a program read from the removable recording medium 211 mounted in the drive 209 and installed in the hard disk 205. The CPU 202 thereby performs the processing according to the flowcharts described above or the processing performed by the configurations in the block diagrams described above. Then, as necessary, the CPU 202, for example, outputs the processing result from the output unit 206, which includes an LCD (Liquid Crystal Display), a speaker, and the like, via the input/output interface 210, transmits it from the communication unit 208, or records it on the hard disk 205.
[0228]
Here, in this specification, the processing steps describing a program for causing a computer to perform various types of processing do not necessarily have to be processed in time series in the order described in the flowcharts, and also include processing executed in parallel or individually (for example, parallel processing or object-based processing).
[0229]
Further, the program may be processed by a single computer, or may be processed in a distributed manner by a plurality of computers. Furthermore, the program may be transferred to a remote computer and executed.
[0230]
In the present embodiment, image data is targeted, but the present invention can also be applied to, for example, audio data.
[0231]
In the present embodiment, encoded data produced by JPEG encoding or MPEG encoding, both of which perform at least DCT processing, is decoded. However, the present invention is also applicable to decoding data transformed by other orthogonal transforms or frequency transforms. That is, the present invention can be applied to, for example, decoding sub-band encoded data, Fourier transformed data, or the like.
[0232]
Furthermore, in the present embodiment, the tap coefficients used for decoding are stored in advance in the decoder 22, but the tap coefficients may instead be included in the encoded data and provided to the decoder 22 in that way.
[0233]
In this embodiment, decoding is performed by a first-order linear prediction calculation using the tap coefficients. However, decoding can also be performed by a prediction calculation of second or higher order.
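One possible form of a second-order prediction calculation is sketched below: the prediction tap is extended with all pairwise products of its elements, and the prediction remains a product-sum over the extended terms. The specification does not state the form of the higher-order calculation, so this construction and the function names are assumptions made for illustration.

```python
def second_order_features(tap):
    """Return the first-order terms followed by all products x_i * x_j (i <= j)."""
    feats = list(tap)
    n = len(tap)
    for i in range(n):
        for j in range(i, n):
            feats.append(tap[i] * tap[j])
    return feats


def predict_second_order(tap, coefficients):
    """Product-sum of the extended terms and their tap coefficients."""
    feats = second_order_features(tap)
    if len(feats) != len(coefficients):
        raise ValueError("expected %d coefficients" % len(feats))
    return sum(w * f for w, f in zip(coefficients, feats))
```

Because the prediction is still linear in the coefficients, such coefficients could be learned with the same normal-equation procedure, using the extended terms as student data.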
[0234]
[Effects of the invention]
According to the first data processing device, the data processing method, and the recording medium of the present invention, the tap coefficient obtained by learning is acquired, and a predetermined prediction calculation is performed using the tap coefficient and the converted data. As a result, the converted data is decoded into the original data, and processed data obtained by performing a predetermined process on the original data is obtained. Therefore, it is possible to efficiently decode the converted data and perform a predetermined process on the decoded data.
[0235]
According to the second data processing device, data processing method, and recording medium of the present invention, the teacher data serving as a teacher is subjected to processing based on a predetermined process, and student data serving as a student is generated by applying at least an orthogonal transform or a frequency transform to the resulting semi-teacher data. Then, learning is performed so that the prediction error of the predicted value of the teacher data obtained by performing the prediction calculation using the tap coefficients and the student data is statistically minimized, and the tap coefficients are obtained. Therefore, by using the tap coefficients, it is possible to efficiently decode data subjected to an orthogonal transform or a frequency transform, and to perform predetermined processing on the decoded data.
[Brief description of the drawings]
FIG. 1 is a diagram for explaining conventional JPEG encoding / decoding.
FIG. 2 is a diagram showing a configuration example of an embodiment of an image transmission system to which the present invention is applied.
FIG. 3 is a block diagram illustrating a configuration example of a decoder 22 in FIG. 2;
FIG. 4 is a diagram illustrating how an 8 × 8 DCT coefficient is decoded into 16 × 16 pixels.
FIG. 5 is a flowchart for explaining processing of the decoder 22 of FIG. 3;
FIG. 6 is a block diagram illustrating a first configuration example of the coefficient conversion circuit 32 in FIG. 3.
FIG. 7 is a diagram illustrating an example of a prediction tap and a class tap.
FIG. 8 is a block diagram illustrating a configuration example of the class classification circuit 43 in FIG. 6.
FIG. 9 is a diagram for explaining processing of the power calculation circuit 51 of FIG. 6;
10 is a flowchart for explaining processing of the coefficient conversion circuit 32 of FIG. 6;
FIG. 11 is a flowchart for explaining more details of the processing in step S12 of FIG.
FIG. 12 is a block diagram illustrating a configuration example of a first embodiment of a learning device to which the present invention has been applied.
13 is a flowchart illustrating processing of the learning device in FIG.
14 is a block diagram showing a second configuration example of the coefficient conversion circuit 32 of FIG. 3; FIG.
FIG. 15 is a block diagram illustrating a configuration example of a second embodiment of a learning device to which the present invention has been applied.
16 is a block diagram illustrating a third configuration example of the coefficient conversion circuit 32 of FIG. 3;
FIG. 17 is a block diagram illustrating a configuration example of a third embodiment of a learning device to which the present invention has been applied.
18 is a block diagram illustrating a fourth configuration example of the coefficient conversion circuit 32 of FIG. 3. FIG.
FIG. 19 is a block diagram illustrating a configuration example of a fourth embodiment of a learning device to which the present invention has been applied.
20 is a block diagram illustrating a configuration example of an encoder 21 in FIG. 2. FIG.
FIG. 21 is a block diagram showing a configuration of an example of an MPEG decoder.
22 is a block diagram illustrating a fifth configuration example of the coefficient conversion circuit 32 of FIG. 3;
FIG. 23 is a block diagram illustrating a configuration example of a fifth embodiment of a learning device to which the present invention has been applied.
FIG. 24 is a diagram illustrating an image with improved temporal resolution.
FIG. 25 is a diagram illustrating an image with improved temporal resolution.
FIG. 26 is a diagram showing that a class tap and a prediction tap are configured from DCT coefficients of two or more frames.
FIG. 27 is a block diagram illustrating a configuration example of an embodiment of a computer to which the present invention has been applied.
[Explanation of symbols]
21 encoder, 22 decoder, 23 recording medium, 24 transmission medium, 31 entropy decoding circuit, 32 coefficient conversion circuit, 33 block decomposition circuit, 41 prediction tap extraction circuit, 42 class tap extraction circuit, 43 class classification circuit, 44 coefficient table storage unit, 45 product-sum operation circuit, 51 power operation circuit, 52 class code generation circuit, 53 threshold value table storage unit, 60 decimation circuit, 61 blocking circuit, 62 DCT circuit, 63 quantization circuit, 64 prediction tap extraction circuit, 65 class tap extraction circuit, 66 class classification circuit, 67 normal equation addition circuit, 68 tap coefficient determination circuit, 69 coefficient table storage unit, 71, 81 inverse quantization circuit, 114 calculator, 115 motion compensation circuit, 120 decimation circuit, 121 motion vector detection circuit, 122 arithmetic unit, 123 blocking circuit, 124 DCT circuit, 125 quantization circuit, 127 inverse quantization circuit, 128 inverse DCT circuit, 129 arithmetic unit, 130 motion compensation circuit, 131 interpolation circuit, 132 arithmetic unit, 201 bus, 202 CPU, 203 ROM, 204 RAM, 205 hard disk, 206 output unit, 207 input unit, 208 communication unit, 209 drive, 210 input / output interface, 211 removable recording medium

Claims

A data processing apparatus for processing converted data obtained by performing at least orthogonal transform processing or frequency transform processing, comprising:
acquisition means for acquiring a tap coefficient obtained by performing learning; and
calculation means for performing a predetermined prediction calculation using the tap coefficient and the converted data, thereby decoding the converted data into original data and, at the same time, obtaining processed data in which a predetermined process has been applied to the original data.

The data processing apparatus according to claim 1, wherein the calculation means performs a linear first-order prediction calculation using the tap coefficient and the converted data.

The data processing apparatus according to claim 1, further comprising storage means for storing the tap coefficient,
wherein the acquisition means acquires the tap coefficient from the storage means.

The data processing apparatus according to claim 1, wherein the converted data is obtained by subjecting the original data to orthogonal transform or frequency transform, followed by quantization.

The data processing apparatus according to claim 4, further comprising inverse quantization means for inversely quantizing the converted data,
wherein the calculation means performs the prediction calculation using the inversely quantized converted data.

The data processing apparatus according to claim 1, wherein the converted data is obtained by performing at least discrete cosine transform on the original data.

The data processing apparatus according to claim 1, further comprising prediction tap extraction means for extracting the converted data used together with the tap coefficient to predict data of interest in the processed data, and outputting the extracted data as a prediction tap,
wherein the calculation means performs the prediction calculation using the prediction tap and the tap coefficient.

The data processing apparatus according to claim 7, further comprising:
class tap extraction means for extracting the converted data used for classifying the data of interest into one of several classes, and outputting the extracted data as a class tap; and
class classification means for performing class classification to obtain the class of the data of interest based on the class tap,
wherein the calculation means performs the prediction calculation using the prediction tap and the tap coefficient corresponding to the class of the data of interest.

The data processing apparatus according to claim 1, wherein the calculation means, by performing the predetermined prediction calculation, obtains the processed data in which processing for improving the quality of the original data has been applied.

The data processing apparatus according to claim 1, wherein the tap coefficient is obtained by performing learning so that the prediction error of the predicted value of the processed data, obtained by performing the predetermined prediction calculation using the tap coefficient and the converted data, is statistically minimized.

The data processing apparatus according to claim 1, wherein the original data is image data of a moving image or a still image.

The data processing apparatus according to claim 11, wherein the calculation means, by performing the predetermined prediction calculation, obtains the processed data in which processing for improving the image quality of the image data has been applied.

The data processing apparatus according to claim 11, wherein the calculation means obtains the processed data in which the resolution of the image data in the temporal or spatial direction is improved.

A data processing method for processing converted data obtained by performing at least orthogonal transform processing or frequency transform processing, comprising:
an acquisition step of acquiring a tap coefficient obtained by performing learning; and
a calculation step of performing a predetermined prediction calculation using the tap coefficient and the converted data, thereby decoding the converted data into original data and, at the same time, obtaining processed data in which a predetermined process has been applied to the original data.

A recording medium on which is recorded a program for causing a computer to perform data processing for processing converted data obtained by performing at least orthogonal transform processing or frequency transform processing, the program comprising:
an acquisition step of acquiring a tap coefficient obtained by performing learning; and
a calculation step of performing a predetermined prediction calculation using the tap coefficient and the converted data, thereby decoding the converted data into original data and, at the same time, obtaining processed data in which a predetermined process has been applied to the original data.

A data processing apparatus for learning a tap coefficient used in a prediction calculation for decoding converted data obtained by performing at least orthogonal transform processing or frequency transform processing and, at the same time, obtaining processed data in which a predetermined process has been applied to the decoded result, comprising:
semi-teacher data generation means for obtaining semi-teacher data by applying a process based on the predetermined process to teacher data serving as a teacher;
student data generation means for generating student data serving as a student by performing at least orthogonal transform or frequency transform on the semi-teacher data; and
learning means for obtaining the tap coefficient by performing learning so that the prediction error of the predicted value of the teacher data, obtained by performing a prediction calculation using the tap coefficient and the student data, is statistically minimized.

The data processing apparatus according to claim 16, wherein the learning means performs learning so that the prediction error of the predicted value of the teacher data, obtained by performing a linear first-order prediction calculation using the tap coefficient and the student data, is statistically minimized.

The data processing apparatus according to claim 16, wherein the student data generation means generates the student data by performing orthogonal transform or frequency transform on the semi-teacher data and then quantizing the result.

The data processing apparatus according to claim 16, wherein the student data generation means generates the student data by performing orthogonal transform or frequency transform on the semi-teacher data, quantizing the result, and then performing inverse quantization.

The data processing apparatus according to claim 16, wherein the student data generation means generates the student data by performing at least discrete cosine transform on the semi-teacher data.

The data processing apparatus according to claim 16, further comprising prediction tap extraction means for extracting the student data used together with the tap coefficient to predict teacher data of interest among the teacher data, and outputting the extracted data as a prediction tap,
wherein the learning means performs learning so that the prediction error of the predicted value of the teacher data, obtained by performing a prediction calculation using the prediction tap and the tap coefficient, is statistically minimized.

The data processing apparatus according to claim 21, further comprising:
class tap extraction means for extracting the student data used for classifying the teacher data of interest into one of several classes, and outputting the extracted data as a class tap; and
class classification means for performing class classification to obtain the class of the teacher data of interest based on the class tap,
wherein the learning means obtains the tap coefficient for each class by performing learning so that the prediction error of the predicted value of the teacher data, obtained by performing a prediction calculation using the prediction tap and the tap coefficient corresponding to the class of the teacher data of interest, is statistically minimized.

The data processing apparatus according to claim 16, wherein the student data generation means generates the student data by performing at least orthogonal transform processing or frequency transform processing on the semi-teacher data for each predetermined unit.

The data processing apparatus according to claim 16, wherein the semi-teacher data generation means generates the semi-teacher data by performing a process of degrading the quality of the teacher data.

The data processing apparatus according to claim 16, wherein the teacher data is image data of a moving image or a still image.

The data processing apparatus according to claim 25, wherein the semi-teacher data generation means generates the semi-teacher data by performing a process of degrading the image quality of the image data.

The data processing apparatus according to claim 25, wherein the semi-teacher data generation means generates the semi-teacher data in which the resolution of the image data in the temporal or spatial direction is degraded.

A data processing method for learning a tap coefficient used in a prediction calculation for decoding converted data obtained by performing at least orthogonal transform processing or frequency transform processing and, at the same time, obtaining processed data in which a predetermined process has been applied to the decoded result, comprising:
a semi-teacher data generation step of obtaining semi-teacher data by applying a process based on the predetermined process to teacher data serving as a teacher;
a student data generation step of generating student data serving as a student by performing at least orthogonal transform or frequency transform on the semi-teacher data; and
a learning step of obtaining the tap coefficient by performing learning so that the prediction error of the predicted value of the teacher data, obtained by performing a prediction calculation using the tap coefficient and the student data, is statistically minimized.

A recording medium on which is recorded a program for causing a computer to perform data processing for learning a tap coefficient used in a prediction calculation for decoding converted data obtained by performing at least orthogonal transform processing or frequency transform processing and, at the same time, obtaining processed data in which a predetermined process has been applied to the decoded result, the program comprising:
a semi-teacher data generation step of obtaining semi-teacher data by applying a process based on the predetermined process to teacher data serving as a teacher;
a student data generation step of generating student data serving as a student by performing at least orthogonal transform or frequency transform on the semi-teacher data; and
a learning step of obtaining the tap coefficient by performing learning so that the prediction error of the predicted value of the teacher data, obtained by performing a prediction calculation using the tap coefficient and the student data, is statistically minimized.