JP4337186B2

JP4337186B2 - Image information conversion apparatus, image information conversion method, learning apparatus, and learning method

Info

Publication number: JP4337186B2
Application number: JP30948499A
Authority: JP
Inventors: 哲二郎近藤; 寿一白木; 秀雄中屋; 靖立平; 俊彦浜松; 隆也星野; 正明服部
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 1998-10-29
Filing date: 1999-10-29
Publication date: 2009-09-30
Anticipated expiration: 2019-10-29
Also published as: JP2000200349A

Description

【０００１】
【発明の属する技術分野】
この発明は、入力画像信号のぼけを改善するための画像情報変換装置および画像情報変換方法、学習装置および学習方法に関する。
【０００２】
【従来の技術】
近年、クラス分類適応処理を用いてぼけを改善することにより、画質改善を行う処理が研究開発されている。クラス分類適応処理は、入力画像信号の信号レベル分布に応じてクラス分類を行い、クラス毎に予め学習によって獲得された予測係数値を所定の記憶部に格納し、かかる予測係数値を使用した重み付け加算式によって注目画素の画素値として最適な推定値を出力する処理である。
【０００３】
この場合、注目画素についてのクラス分類を行う際に必要なデータを得るために当該注目画素の周辺に配置されるクラスタップや、当該注目画素の画素値を推定する演算を行う際に必要なデータを得るために当該注目画素の周辺に配置される予測タップを、適応的に、すなわち注目画素毎に最適な間引き間隔で配置することができれば、そのような間引き間隔についての制御を行わない場合に比べ、大幅な画質改善が可能となる。しかしながら、そのような制御は実現されていなかった。
【０００４】
【発明が解決しようとする課題】
従って、この発明の目的は、クラス分類適応処理を用いて画質改善を行う際に、画質改善の程度を向上させることが可能な、画像情報変換装置および画像情報変換方法、学習装置および学習方法を提供することにある。
【０００５】
【課題を解決するための手段】
請求項１の発明は、入力画像信号を、相異なる複数種類の間引き間隔で間引いた画像から、夫々予測画素値を生成する予測画像信号生成手段と、
予測画像信号生成手段から出力される複数の予測画素値の内の最大値および最小値の一方を、入力画像信号の特徴に基づいて選択的に出力する最適画素値出力手段とを備え、
最適画素値出力手段は、
入力画像信号から、注目画素の周辺に指定される所定位置の画素値を統計的データ用クラスタップとして出力する統計的データ用クラスタップ抽出手段と、
統計的データ用クラスタップ抽出手段の出力に基づいて、統計的データ用クラスコードを発生する統計的データ用クラスコード発生手段と、
統計的データを予め記憶し、統計的データ用クラスコード発生手段の出力に対応する統計的データを出力する統計的データ記憶手段と、
統計的データ記憶手段の出力に基づいて、最大値および最小値の内から、最適な値を出力する出力値決定手段とを備えることを特徴とする画像情報変換装置である。
【０００６】
請求項４の発明は、入力画像信号を、相異なる複数種類の間引き間隔で間引いた画像から、夫々予測画素値を生成する予測画像信号生成ステップと、
予測画像信号生成ステップによって生成される複数の予測画素値の内の最大値および最小値の一方を、入力画像信号の特徴に基づいて選択的に出力する最適画素値出力ステップとを備え、
最適画素値出力ステップは、
入力画像信号から、注目画素の周辺に指定される所定位置の画素値を統計的データ用クラスタップとして出力する統計的データ用クラスタップ抽出ステップと、
統計的データ用クラスタップ抽出ステップの出力に基づいて、統計的データ用クラスコードを発生する統計的データ用クラスコード発生ステップと、
予め記憶されている統計的データから、統計的データ用クラスコード発生ステップにより発生される統計的データ用クラスコードに対応する統計的データを出力する統計的データ出力ステップと、
統計的データ出力ステップの出力に基づいて、最大値および最小値の内から、最適な値を出力する出力値決定ステップとを備えることを特徴とする画像情報変換方法である。
【０００７】
請求項５の発明は、入力画像信号を劣化させる信号劣化処理手段と、
信号劣化処理手段から出力される画像を相異なる複数種類の間引き間隔で間引いた画像から、夫々予測画素値を生成する予測画像信号生成手段と、
入力画像信号を参照して、予測画像信号生成手段から出力される複数の予測画素値の内の最大値および最小値の一方を選択的に出力する最適画素値出力手段と、
最適画素値出力手段の出力の内、所定の条件を満たすものの個数を、当該画素が最大値、最小値の内の何れであったかの区分、および信号劣化処理手段から出力される画像の特徴を表現する情報を発生する手段の出力に対応して計数する計数手段とを備えることを特徴とする学習装置である。
【０００８】
請求項１１の発明は、入力画像信号を劣化させる信号劣化処理ステップと、
信号劣化処理ステップによって生成される画像を相異なる複数種類の間引き間隔で間引いた画像から、夫々予測画素値を生成する予測画像信号生成ステップと、
入力画像信号を参照して、予測画像信号生成ステップによって生成される複数の予測画素値の内の最大値および最小値の一方を選択的に出力する最適画素値出力ステップと、
最適画素値出力ステップの結果の内、所定の条件を満たすものの個数を、当該画素が最大値、最小値の内の何れであったかの区分、および信号劣化処理ステップによって生成される画像の特徴を表現する情報を発生するステップの出力に対応して計数するステップとを備えることを特徴とする学習方法である。
【０００９】
以上のような発明によれば、互いに異なる間引き間隔を有する複数種類のクラスタップ構造に基づくクラス分類適応処理の結果の内で最適なものが出力画像信号として選択される。
【００１０】
【発明の実施の形態】
以下、この発明の一実施形態について説明する。なお、クラス分類適応処理は、入力画像信号の信号レベル分布等の入力画像信号の特徴データに応じてクラス分類を行い、クラス毎に予め学習によって獲得された予測係数値を所定の記憶部に格納し、かかる予測係数値を使用した重み付け加算式によって注目画素の画素値として最適な推定値を出力する処理である。
【００１１】
入力画像信号の信号レベル分布を把握してクラス分類を行うために、入力画像信号上にクラスタップが配置される。クラスタップは、入力画像信号の信号レベル分布を表現する特徴を計算するためのデータとして、入力画像信号の所定位置から抽出される画素値である。クラスタップは、例えば、注目画素を含むフレーム上と、その前後のフレーム上とに配置される。注目画素を含むフレーム上におけるクラスタップ構造の一例を図１に示す。ここで、図１Ａは、間引き間隔１すなわち注目画素に対して縦／横に隣接する、画素位置にクラスタップが配置される。すなわち、間引き間隔１の場合、間引きがなされないことになる。また、図１Ｂは、間引き間隔２すなわち注目画素に対して縦／横に２番目の画素位置にクラスタップが配置される。さらに、図１Ｃは、間引き間隔３すなわち注目画素に対して縦／横に３番目の画素位置にクラスタップが配置される。同様に、間引き間隔４、５等、間引き間隔をさらに広げたクラスタップ配置を用いることもできる。ぼけ画像においては、ぼけの範囲の広がりの程度に応じて、何れの間引き間隔の下でクラス分類を行うのが最適であるかが異なる。
【００１２】
以下、この発明の一実施形態についての説明に先立って、複数種類の間引き量の下で設定されたクラスタップ配置において、クラス分類適応処理によるぼけ画像の画質改善を行うための、既に提案されている構成について説明する。まず、学習、すなわち予測係数を得るための処理に係る構成の一例を図２に示す。ぼけを含まない、所定の入力画像信号（教師信号と称される）がＬＰＦ(Low Pass Filter) 回路１０と、領域切り出し回路１１とに供給される。ＬＰＦ回路１０は、入力画像信号にＬＰＦ処理を施して、劣化した（ぼけの生じた）画像信号を生成し、劣化した画像信号を領域切り出し回路１２、１３および領域切り出し回路２２、２３、並びに領域切り出し回路３２、３３に供給する。
【００１３】
領域切り出し回路１２は、劣化した画像信号から所定範囲の画素領域を予測タップとして切り出し、切り出した領域を正規方程式加算回路１６に供給する。また、領域切り出し回路１３は、図１Ａに示したような間引き間隔１のクラスタップ配置の下で所定範囲の画素領域をクラスタップとして切り出し、切り出した領域を特徴抽出回路１４に供給する。特徴抽出回路１４は、領域切り出し回路１３の出力に基づいて劣化した画像信号の特徴を抽出し、抽出した特徴をクラスコード発生回路１５に供給する。クラスコード発生回路１５は、特徴抽出回路１４の出力に基づいてクラスコードを発生させ、発生させたクラスコードを正規方程式加算回路１６に供給する。
【００１４】
一方、領域切り出し回路１１は、教師信号から所定の画素領域を切り出し、切り出した領域を正規方程式加算回路１６に供給する。正規方程式加算回路１６は、領域切り出し回路１１の出力、領域切り出し回路１２の出力、およびクラスコード発生回路１５の出力に基づいて所定の計算処理を行って、クラス毎に予測係数を算出するための正規方程式に係るデータを生成し、生成したデータを予測係数決定回路１７に供給する。予測係数決定回路１７は、供給されるデータに基づいて正規方程式を解く計算処理を行って予測係数を算出し、算出した予測係数をメモリ１８に供給する。メモリ１８は、供給される予測係数を記憶する。
【００１５】
領域切り出し回路２３、特徴抽出回路２４、クラスコード発生回路２５，正規方程式加算回路２６、予測係数決定回路２７、およびメモリ２８は、それぞれ、上述した、領域切り出し回路１３、特徴抽出回路１４、クラスコード発生回路１５、正規方程式加算回路１６、予測係数決定回路１７、およびメモリ１８と同様なものである。但し、領域切り出し回路２３は、図１Ｂに示したような間引き間隔２のクラスタップ配置の下で領域切り出しを行う。
【００１６】
また、領域切り出し回路３３、特徴抽出回路３４、クラスコード発生回路３５，正規方程式加算回路３６、予測係数決定回路３７、およびメモリ３８は、それぞれ、上述した、領域切り出し回路１３、特徴抽出回路１４、クラスコード発生回路１５、正規方程式加算回路１６、予測係数決定回路１７、およびメモリ１８と同様なものである。但し、領域切り出し回路３３は、図１Ｃに示したような間引き間隔３のクラスタップ配置の下で領域切り出しを行う。
【００１７】
ここで、予測係数の算出に係る演算についてより詳細に説明する。図４を参照して後述するように、クラス分類適応処理による予測画像信号は、入力画像信号の所定の画素位置から抽出される予測タップと、学習によって得られた予測係数とに基づいて、以下の式（１）に従って順次予測生成される画素値ｙからなる。
【００１８】
ｙ＝ｗ₁×ｘ₁＋ｗ₂×ｘ₂＋‥‥＋ｗ_n×ｘ_n （１）
ここで、ｘ₁，‥‥，ｘ_nが各予測タップであり、ｗ₁，‥‥，ｗ_nが各予測係数である。
【００１９】
正規方程式加算回路１６、２６、３６は、それぞれ、領域切り出し回路１２、２２、３２から供給される予測タップ、およびクラスコード発生回路１５、２５、３５から供給されるクラスコード、並びに教師信号に基づいて加算処理を行うことにより、予測係数ｗ₁，‥‥，ｗ_nを解とする正規方程式を解くために必要なデータを算出する。そして、予測係数決定回路１７、２７、３７は、供給されるデータに基づいて正規方程式を解くための計算処理を行って予測係数を算出する。
【００２０】
正規方程式について説明する。上述の式（１）において、学習前は予測係数ｗ₁，‥‥，ｗ_nが未定係数である。学習は、クラス毎に複数の教師信号を入力することによって行う。教師信号の種類数をｍと表記する場合、式（１）から、以下の式（２）が設定される。
【００２１】
ｙ_k＝ｗ₁×ｘ_k1＋ｗ₂×ｘ_k2＋‥‥＋ｗ_n×ｘ_kn （２）
（ｋ＝１，２，‥‥，ｍ）
ｍ＞ｎの場合、予測係数ｗ₁，‥‥，ｗ_nは一意に決まらないので、誤差ベクトルｅの要素ｅ_kを以下の式（３）で定義して、式（４）によって定義される誤差ベクトルｅを最小とするように予測係数を定めるようにする。すなわち、いわゆる最小２乗法によって予測係数を一意に定める。
【００２２】
ｅ_k＝ｙ_k−｛ｗ₁×ｘ_k1＋ｗ₂×ｘ_k2＋‥‥＋ｗ_n×ｘ_kn｝（３）
（ｋ＝１，２，‥‥ｍ）
【００２３】
【数１】

【００２４】
式（４）のｅ²を最小とする予測係数を求めるための実際的な計算方法としては、ｅ²を予測係数ｗ_i(i=1,2‥‥）で偏微分し（式（５））、ｉの各値について偏微分値が０となるように各予測係数ｗ_iを定めれば良い。
【００２５】
【数２】

【００２６】
式（５）から各予測係数ｗ_iを定める具体的な手順について説明する。式（６）、（７）のようにＸ_ji，Ｙ_iを定義すると、式（５）は、式（８）の行列式の形に書くことができる。
【００２７】
【数３】

【００２８】
【数４】

【００２９】
【数５】

【００３０】
式（８）が一般に正規方程式と呼ばれるものである。予測係数決定回路１７、２７、３７は、正規方程式データに基づいて、掃き出し法等の一般的な行列解法に従って正規方程式を解くための計算処理を行って予測係数ｗ_iを算出する。
【００３１】
なお、図２には、予測タップの間引き間隔を固定してクラスタップの間引き間隔を変える構成を示したが、予測タップの間引き間隔を、クラスタップの間引き間隔と共に変える構成としても良い。そのような構成について図３に示す。図３において、各構成要素は、図２中で同一の符号を付したものと同様なものである。但し、領域切り出し回路１２、２２、３２は、それぞれ、間引き間隔１、２、３のタップ構造を有する予測タップを切り出す。
【００３２】
次に、この発明の一実施形態において、上述したようにして得られた予測係数を使用して、各間引き間隔の下で配置されるクラスタップ毎の予測画像信号を生成するための構成について説明する。そのような構成の一例を図４に示す。入力画像信号が領域切り出し回路１０１、１０２、２０１、２０２、３０１、３０２に供給される。ここで、入力画像信号には、伝送路等において帯域制限によってぼけが生じていることがあり、そのままでは表示画像がぼけたものとなる場合がある。領域切り出し回路１０１、２０１、３０１は、入力画像信号から予測タップとして使用される画素領域を切り出し、切り出した領域を、推定演算部１０６、２０６、３０６にそれぞれ供給する。
【００３３】
一方、領域切り出し回路１０２、２０２、３０２は、供給される画像信号から所定の画素領域をクラスタップとして切り出し、切り出した領域を、特徴抽出回路１０３、２０３、３０３にそれぞれ供給する。特徴抽出回路１０３、２０３、３０３は、それぞれ、領域切り出し回路１０２、２０２、３０２の出力に基づいてぼけ画像の特徴を抽出し、抽出した特徴をクラスコード発生回路１０４、２０４、３０４にそれぞれ供給する。
【００３４】
クラスコード発生回路１０４、２０４、３０４は、それぞれ、特徴抽出回路１０３、２０３、３０３の出力に基づいてクラスコードを発生させ、発生させたクラスコードを、ＲＯＭ１０５、２０５、３０５にそれぞれ供給する。ＲＯＭ１０５、２０５、３０５は、図２等を参照して上述したようにして算出された予測係数を記憶している。すなわち、図２または図３中のメモリ１８、２８、３８の記憶内容がＲＯＭ１０５、２０５、３０５にそれぞれ予めロードされている。
【００３５】
ＲＯＭ１０５、２０５、３０５は、それぞれ、クラスコード発生回路１０４、２０４、３０４の出力に対応する予測係数を出力する。出力される予測係数は、それぞれ、推定演算部１０６、２０６、３０６に供給される。推定演算部１０６、は、ＲＯＭ１０５から供給される予測係数と、領域切り出し回路１０１から供給される予測タップとして使用される画素値との線型一次結合（式（１）参照）を計算することにより、各画素値ｙを推定し、推定される画素値ｙの総体としての予測画像信号を生成する。
【００３６】
このようにして生成される予測画像においては、伝送路等における入力画像信号の劣化に起因する画像のぼけが解消若しくは軽減されている。なお、図４に示した構成では、領域切り出し回路１０１、２０１、３０１における間引き間隔は一定とされているが、図３に示した構成に対応して、領域切り出し回路１０１、２０１、３０１がそれぞれ、間引き間隔１、２、３のタップ構造を有する予測タップを切り出すようにしても良い。
【００３７】
この発明は、クラス分類適応処理処理において上述したような例えば３種類等の複数個の間引き間隔のクラスタップ構造を用いることによって得られる例えば３種類等の複数個の予測値の内で最適な予測値を決定し、決定結果に基づいて最終的な出力画像信号を作成するようにしたものである。
【００３８】
この発明の一実施形態における画像処理装置の一例を図５に示す。入力画像信号は、予測画像生成部５１に供給される。予測画像生成部５１は間引き間隔毎の予測画像を生成する構成であり、予測画像生成部５１としては、図４を参照して上述した構成等を使用することができる。かかる構成を使用する場合には、予測画像生成部５１は、例えば３種類の間引き間隔に対応する３個の予測画像信号を生成し、これら３個の予測画像信号を予測値選択回路５３に供給する。但し、間引き間隔の種類、およびそれらに対応して生成される予測画像信号の個数は、３に限定されるものでは無い。予測値選択回路５３は、供給される予測値の内から各画素位置毎に最大の予測値および最小の予測値を選択し、選択した予測値を出力値決定回路５４に供給する。
【００３９】
入力画像信号が領域切り出し回路５５に供給される。領域切り出し回路５５は、入力画像信号から所定の画素領域をクラスタップとして切り出し、切り出した画素領域を特徴抽出回路５６に供給する。特徴抽出回路５６は、領域切り出し回路５５の出力に基づいてぼけ画像の特徴を抽出し、抽出した特徴をクラスコード発生回路５７に供給する。クラスコード発生回路５７は、特徴抽出回路５６の出力に基づいてクラスコードを発生させ、発生させたクラスコードを統計的データＲＯＭ５８に供給する。
【００４０】
統計的データＲＯＭ５８は、後述するような統計的データをクラスコード毎に記憶しており、クラスコード発生回路５７から供給されるクラスコードに対応する統計的データを予測値決定回路５４に供給する。予測値決定回路５４は、供給される統計的データを参照して、予測値選択回路５３から供給される各画素位置毎の最大の予測値および最小の予測値の内で最適なものを決定し、決定結果に基づいて出力画像信号を作成する。この際の処理について図６を参照して具体的に説明する。ここで、原画像の信号波形の一例を黒丸で示す。また、この一例に対応する位置における、予測画像生成部５１によって生成される予測画像信号の信号波形の一例を３個の白丸で示す。最適な予測値としては、黒丸に最も近い白丸が選択されれば良い。このためには、信号波形のピークが上に凸の場合に３個の予測値の内の最大値を選択し、信号波形のピークが下に凸の場合に３個の予測値の内の最小値を選択するようにすれば良い。また、ピーク以外の部分では、原画像信号と予測画像信号との間のレベル差が小さいので、最適な予測値としては、３個の予測値の内の何れを選んでも良い。以上の状況を考慮して、この発明の一実施形態では、予測画像生成部５１によって生成される３個の予測値の内から最大および最小の予測値を予測値選択回路５３が選択し、最大および最小の予測値の内から、後述するような統計的データを参照して、予測値決定回路５４が出力画像信号として最適なものを選択するようになされている。
【００４１】
また、特徴抽出回路５６が出力する、クラス分類のための特徴としては、例えば、一次微分の符号と大きさ、二次微分の符号を用いることができる。例えば、注目画素を中心とした水平５タップからなる、図７に示すようなクラスタップ構造の一例を用いる場合に、各々の画素位置について、一次微分Ｄの符号（すなわちＤの正負）についての１ビット、一次微分Ｄの大きさに関する１ビット、（すなわち｜Ｄ｜の値を１ビットＡＤＲＣで量子化した量子化値）、二次微分Ｅの符号（すなわちＥの正負）に関する１ビットの計３ビットを割り当てる。従って、５画素に対するビット数は３×５＝１５ビットとなり、総クラス数は２¹⁵＝３２７６８クラスとなる。ここで使用される水平５タップについて間引きを行うようにしても良い。一次微分Ｄ，二次微分Ｅはそれぞれ、以下の式（９）、式（１０）によって計算される。
【００４２】
この発明の一実施形態において、ＡＤＲＣ回路３は、領域切り出し回路２によって分離されたそれぞれ５画素のＳＤデータを各２ビットに圧縮するものとする。以下、圧縮されたＳＤデータをそれぞれｑ₁〜ｑ₅と表記する。これらのパターン圧縮データがクラスコード発生回路６に供給される。
【００４３】
Ｄ（ｉ，ｊ）＝ｆ（ｉ，ｊ＋１）−ｆ（ｉ，ｊ−１）（９）
Ｅ（ｉ，ｊ）＝ｆ（ｉ，ｊ＋１）＋ｆ（ｉ，ｊ−１）−２×ｆ（ｉ，ｊ）（１０）
ここで、ｉ，ｊは、画素位置を２次元的に表す座標であり、ｆ（ｉ，ｊ）は、画素位置における画素値である。ここでは、クラスタップを２次元に配置する場合を例として説明したが、この発明は、クラスタップが３次元（時空間）内でに配置される場合等にも、適用することができる。なお、１ビットＡＤＲＣ（Adaptive Dynamic Range Coding)は、何らかのデータの時間的または空間的な変化パターンを少ないビット数で表現するための処理である。｜Ｄ｜のダイナミックレンジをＤＲ，ビット割当をｎ，各画素位置における｜Ｄ｜のデータレベルをＬ，再量子化コードをＱとして、以下の式（１１）により、｜Ｄ｜の最大値ｍａｘと最小値ｍｉｎとの間を指定されたビット長で均等に分割して再量子化を行う。
【００４４】
ＤＲ＝ＭＡＸ−ＭＩＮ＋１
Ｑ＝｛（Ｌ−ＭＩＮ＋０．５）×２／ＤＲ｝（１１）
但し、｛｝は切り捨て処理を意味する。
【００４５】
また、予測値決定回路５４が２種類の統計的データを参照して動作する構成とすれば、最適な予測値をより高精度に決定することができる。このような構成の一例を図８に示す。図８中で、図５中の各構成要素と同様な構成要素には同一の符号を付した。かかる構成の一例は、統計的データの出力に係る処理系列として、領域切り出し回路５５、特徴抽出回路５６、クラスコード発生回路５７，および統計的データＲＯＭ５８を含む第１の処理系列と、領域切り出し回路１５５、特徴抽出回路１５６、クラスコード発生回路１５７，および統計的データＲＯＭ１５８を含む第２の処理系列とを備える。
【００４６】
ここで、第１の系列が行う処理によって統計的データＲＯＭ５８から出力される第１の統計的データを参照して予測値決定回路５４が動作した結果として曖昧な判定が行われる場合には、第２の系列が行う処理によって統計的データＲＯＭ１５８から出力される第２の統計的データを参照して予測値決定回路５４が動作するように制御する等の制御方法により、最適な予測値を決定する精度を向上させることができる。なお、クラスコードを発生させる構成を３系列以上備え、これらが発生する３種類以上のクラスコードの各々に対応して統計的データを記憶している統計的データＲＯＭも３個以上備えることにより、予測値決定回路５４が３種類以上の統計的データを参照して動作するようにし、最適な予測値を決定する精度をさらに向上させることも可能である。
【００４７】
次に、予測値決定回路５４の動作において参照される統計的データの生成について図９を参照して説明する。教師信号としての入力画像信号がＬＰＦ６１、最適予測値選択回路６４、および残差計算回路６５とに供給される。ＬＰＦ６１は、供給される画像信号にＬＰＦ処理を施し、劣化した画像信号を生成する。ＬＰＦ６１の出力は、予測画像生成部６２および領域切り出し回路６６に供給される。
【００４８】
予測画像生成部６２は間引き間隔毎の予測画像を生成する構成であり、予測画像生成部６２としては、図５、図８中の予測画像生成部５１と同様の構成を有するものが使用される。予測画像生成部６２の出力である、例えば３個の予測画像信号が予測値選択回路６３に供給される。予測値選択回路６３は、予測画像生成部６２から供給される予測値の内から各画素位置毎に最大の予測値ＭＡＸおよび最小の予測値ＭＩＮを選択し、最適予測値出力回路６４に供給する。
【００４９】
最適予測値出力回路６４は、予測値選択回路６３の出力の内から教師信号との残差すなわち信号レベル差の絶対値が小さいものを選択し、選択した予測値（すなわちＭＡＸ或いはＭＩＮ）を最適な予測値として残差計算回路６５に供給する。また、最適予測値出力回路６４は、選択した予測値がＭＡＸ、ＭＩＮの何れであるかを示す信号ｄをカウント回路６９に供給する。残差計算回路６５は、最適予測値出力回路６４から供給される最適な予測値と、教師信号とに基づいて残差を計算し、計算値をカウント回路６９に供給する。
【００５０】
一方、領域切り出し回路６６は、ＬＰＦ６１の出力から所定の画素領域をクラスタップとして切り出し、切り出した画素領域を特徴抽出回路６７に供給する。特徴抽出回路６７は、領域切り出し回路６６の出力に基づいて、図５中の特徴抽出回路５６と同様な動作を行うことにより、画像の特徴を抽出する。特徴抽出回路６７によって抽出される特徴がクラスコード発生回路６８に供給される。クラスコード発生回路６８は、特徴抽出回路６７の出力に基づいてクラスコードを発生させ、発生させたクラスコードをカウント回路６９に供給する。領域切り出し回路６６、特徴抽出回路６７、およびクラスコード発生回路６８は、それぞれ、図５中の領域切り出し回路５５、特徴抽出回路５６、およびクラスコード発生回路５７と同様なものとされる。
【００５１】
カウント回路６９は、残差計算回路６５から供給される残差が所定のしきい値より大きい画素のみを、クラスコード発生回路６８から供給されるクラスコード、および上述の信号ｄによって伝えられる、当該画素が最大値／最小値の何れとして最適予測値出力回路６４にて選択されたものであるかの情報に応じてカウントすることにより、統計的データを生成する。生成される統計的データは、メモリ７０に供給され、記憶される。ここで、ある画素がカウントの対象とされるか否かは、例えば、当該画素に係る残差が所定のしきい値よりも大きくなるか否かに応じて判定することができる。統計的データの一例を図１０に示す。図１０では、クラスコードがクラス１を示す場合において、最適予測値出力回路６４によって最大値が選択された場合の画素であって、当該画素と教師信号との残差の絶対値がしきい値よりも大きいもののカウント数が１０であり、最適予測値出力回路６４によって最小値が選択された場合の画素であって、当該画素と教師信号との残差の絶対値がしきい値より大きいもののカウント数が１であることが示されている。
【００５２】
また、クラスコードがクラス２を示す場合に、最大値／最小値として選択された画素であって、当該画素と教師信号との残差が大きいもののカウント数が２／１５であることが示されている。同様に、クラスコードがクラス３を示す場合に、最大値／最小値として選択された画素であって、当該画素と教師信号との残差が大きいもののカウント数が５／１０であることが示されている。
【００５３】
このように、残差による制限、すなわち残差がある程度以上大きい画素のみをカウントすることで統計的データを得るようにすることにより、間引き間隔等の条件との関連で画質の向上に寄与する度合いが小さい部分を排除し、画質の向上に寄与する度合いが大きい部分のみが統計的データに反映するようになされる。
【００５４】
また、最適予測値と教師信号との画素毎の残差以外の量に注目して、統計データに反映する部分を制限するようにしても良い。例えば、間引き間隔毎の複数の予測値の間における、ダイナミックレンジＤＲ、または微分値の大きさを注目して、統計データに反映する部分を制限するようにした構成も可能である。この際にどのような制限を課すかによって、複数種類の統計的データを得ることができる。このようにして、図８等を参照して上述した構成において用いられる複数種類の統計的データを生成することが可能となる。また、領域切り出し回路５５、と領域切り出し回路１５５、および特徴抽出回路５６と特徴抽出回路１５６における処理を変えても良い。
【００５５】
メモリ７０に記憶された統計的データは、図５、図８中の統計的データＲＯＭ５８にロードされる。図５および図８中の予測値決定回路５４は、予測値選択回路５３から供給される最大の画素値および最小の画素値の内で、統計的データ中でカウント値が多い方を好適な予測画素値として決定するようになされる。但し、統計的データ中で最大の画素値および最小の画素値に対するカウント値が互いに拮抗している場合には、当該統計的データを参照して好適な予測画素値を決定すると、決定結果の信頼性が低いと考えられる。このような場合に決定結果が誤った場合には、大幅な画質劣化が生じるおそれがある。
【００５６】
そこで、統計的データ中で最大の画素値および最小の画素値に対するカウント値が互いに拮抗している場合に以下のような処理を行うようにすることにより、予測値決定回路５４が行う好適な予測画素値の決定についての信頼性を担保することができる。▲１▼図８に示したような構成において、当該統計的データ以外の統計的データを予測値決定回路５４の動作において参照する。▲２▼最大の画素値と最小の画素値との平均値を予測値決定回路５４から出力させる。ここで、▲１▼、▲２▼は何れも例であり、これらに限定されるものでは無い。
【００５７】
なお、統計的データ中で最大の画素値および最小の画素値に対するカウント値が互いに拮抗している場合は、例えば以下の式（１２）が成り立つ場合として判定することができる。
【００５８】
ａ／（ａ−ｂ）＜threshold （１２）
ここで、ａは多い方の度数を表し、ｂは少ない方の度数を表す。また、thresholdは、所定のしきい値である。
【００５９】
この発明は、上述したこの発明の一実施形態に限定されるものでは無く、この発明の主旨を逸脱しない範囲内で様々な変形や応用が可能である。
【００６０】
【発明の効果】
上述したように、この発明は、互いに異なる間引き間隔を有する複数種類のクラスタップ構造および／または予測タップ構造に基づくクラス分類適応処理の結果の内から、統計的データを参照して最適なものを選択し、選択した画像信号を出力画像信号とするものである。
【００６１】
このため、入力画像信号内の各画素位置について、当該画素位置における画像劣化の度合い、画像中の特徴等に対してより適合性の高い間引き間隔の下でのクラス分類適応処理結果を使用して、出力画像信号を生成することができる。
【００６２】
特に、この発明では、画素毎に最適な間引き間隔を選択できるので、画像信号全体に対して一定の間引き間隔の下でクラス分類適応処理を行う場合に比して、より大きな画質改善を実現することができる。
【００６３】
また、互いに異なる間引き間隔の下でのクラス分類適応処理の結果の内から、最大値、最小値を検出し、検出した最大値、最小値の内から最適なものを選択する処理を行うことにより、互いに異なる間引き間隔の下でのクラス分類適応処理の結果の全部（例えば３個の予測画像信号）から最適なものを選択する場合に比べ、特に信号波形のピーク付近において、容易に、且つ正確に最適な予測画像信号を選択することができる。
【００６４】
さらに、統計的データを求める処理において、画像信号の特徴に関連する所定の条件が加味されるようにすることにより、画質改善のためにより的確な統計的データを得ることができる。
【００６５】
また、統計的データを複数種類使用する構成とすれば、より大きな画質改善を実現することができる。
【図面の簡単な説明】
【図１】各間引き間隔におけるクラスタップ構造の一例を示す略線図である。
【図２】クラス分類適応処理における学習に係る構成の一例を示すブロック図である。
【図３】クラス分類適応処理における学習に係る構成の他の例を示すブロック図である。
【図４】クラス分類適応処理における予測推定に係る構成の一例を示すブロック図である。
【図５】この発明の一実施形態における、予測推定に係る構成の一例を示すブロック図である。
【図６】原画像信号の信号波形と、予測画像信号の信号波形との一例を示す略線図である。
【図７】統計的データを参照する際のクラス分類の一例について説明するための略線図である。
【図８】この発明の一実施形態における構成の他の例を示すブロック図である。
【図９】この発明の一実施形態における、統計的データを得るための構成の一例を示すブロック図である。
【図１０】統計的データの一例を示す略線図である。
【符号の説明】
５１・・・予測画像出力部、５３・・・予測値選択回路、５４・・・出力値決定回路、５８・・・統計的データＲＯＭ
[0001]
[Technical field to which the invention belongs]
The present invention relates to an image information conversion device, an image information conversion method, a learning device, and a learning method for reducing blur in an input image signal.
[0002]
[Prior art]
In recent years, processing that improves image quality by reducing blur with class classification adaptive processing has been researched and developed. Class classification adaptive processing classifies the input image signal according to its signal level distribution, stores prediction coefficient values obtained in advance by learning for each class in a predetermined storage unit, and outputs an optimum estimate of the pixel value of the pixel of interest by a weighted-sum formula that uses those prediction coefficient values.
[0003]
In this case, two kinds of taps are placed around the pixel of interest: class taps, which supply the data needed to classify the pixel of interest, and prediction taps, which supply the data needed to estimate its pixel value. If these taps could be placed adaptively, that is, at the optimum thinning interval for each pixel of interest, image quality could be improved substantially compared with the case where the thinning interval is not controlled. However, no such control has been realized.
[0004]
[Problems to be solved by the invention]
Accordingly, an object of the present invention is to provide an image information conversion device, an image information conversion method, a learning device, and a learning method that can raise the degree of image quality improvement achieved when class classification adaptive processing is used to improve image quality.
[0005]
[Means for Solving the Problems]
  The invention of claim 1 is an image information conversion device comprising: predicted image signal generation means for generating a predicted pixel value from each of the images obtained by thinning an input image signal at a plurality of different thinning intervals; and
  optimum pixel value output means for selectively outputting, based on features of the input image signal, one of the maximum value and the minimum value among the plurality of predicted pixel values output from the predicted image signal generation means,
  wherein the optimum pixel value output means comprises:
  statistical data class tap extraction means for outputting, from the input image signal, pixel values at predetermined positions specified around the pixel of interest as statistical data class taps;
  statistical data class code generation means for generating a statistical data class code based on the output of the statistical data class tap extraction means;
  statistical data storage means for storing statistical data in advance and outputting the statistical data corresponding to the output of the statistical data class code generation means; and
  output value determination means for outputting the optimum value from among the maximum value and the minimum value based on the output of the statistical data storage means.
[0006]
  The invention of claim 4 is an image information conversion method comprising: a predicted image signal generation step of generating a predicted pixel value from each of the images obtained by thinning an input image signal at a plurality of different thinning intervals; and
  an optimum pixel value output step of selectively outputting, based on features of the input image signal, one of the maximum value and the minimum value among the plurality of predicted pixel values generated in the predicted image signal generation step,
  wherein the optimum pixel value output step comprises:
  a statistical data class tap extraction step of outputting, from the input image signal, pixel values at predetermined positions specified around the pixel of interest as statistical data class taps;
  a statistical data class code generation step of generating a statistical data class code based on the output of the statistical data class tap extraction step;
  a statistical data output step of outputting, from statistical data stored in advance, the statistical data corresponding to the statistical data class code generated in the statistical data class code generation step; and
  an output value determination step of outputting the optimum value from among the maximum value and the minimum value based on the output of the statistical data output step.
[0007]
  The invention of claim 5 is a learning device comprising: signal degradation processing means for degrading an input image signal;
  predicted image signal generation means for generating a predicted pixel value from each of the images obtained by thinning the image output from the signal degradation processing means at a plurality of different thinning intervals;
  optimum pixel value output means for selectively outputting, with reference to the input image signal, one of the maximum value and the minimum value among the plurality of predicted pixel values output from the predicted image signal generation means; and
  counting means for counting the outputs of the optimum pixel value output means that satisfy a predetermined condition, separately according to whether the pixel was the maximum value or the minimum value and according to the output of means for generating information that expresses features of the image output from the signal degradation processing means.
[0008]
  The invention of claim 11 is a learning method comprising: a signal degradation processing step of degrading an input image signal;
  a predicted image signal generation step of generating a predicted pixel value from each of the images obtained by thinning the image generated in the signal degradation processing step at a plurality of different thinning intervals;
  an optimum pixel value output step of selectively outputting, with reference to the input image signal, one of the maximum value and the minimum value among the plurality of predicted pixel values generated in the predicted image signal generation step; and
  a step of counting the results of the optimum pixel value output step that satisfy a predetermined condition, separately according to whether the pixel was the maximum value or the minimum value and according to the output of a step of generating information that expresses features of the image generated in the signal degradation processing step.
[0009]
According to the invention as described above, the optimum one among the results of the class classification adaptive processing based on a plurality of types of class tap structures having different thinning intervals is selected as the output image signal.
[0010]
[Embodiments of the invention]
An embodiment of the present invention is described below. Class classification adaptive processing classifies the input image signal according to its feature data, such as its signal level distribution, stores prediction coefficient values obtained in advance by learning for each class in a predetermined storage unit, and outputs an optimum estimate of the pixel value of the pixel of interest by a weighted-sum formula that uses those prediction coefficient values.
[0011]
Class taps are placed on the input image signal so that its signal level distribution can be captured and used for class classification. A class tap is a pixel value extracted from a predetermined position of the input image signal as data for calculating features that express the signal level distribution. Class taps are placed, for example, on the frame containing the pixel of interest and on the frames before and after it. FIG. 1 shows examples of the class tap structure on the frame containing the pixel of interest. In FIG. 1A, the thinning interval is 1: class taps are placed at the pixel positions vertically and horizontally adjacent to the pixel of interest, so no thinning is performed. In FIG. 1B, the thinning interval is 2: class taps are placed at the second pixel positions vertically and horizontally from the pixel of interest. In FIG. 1C, the thinning interval is 3: class taps are placed at the third pixel positions vertically and horizontally from the pixel of interest. Class tap arrangements with still wider thinning intervals, such as 4 or 5, can be used in the same way. In a blurred image, which thinning interval is best for class classification depends on how far the blur extends.
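The tap placement described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: FIG. 1 is not reproduced in this text, so the cross-shaped five-tap layout (the pixel of interest plus one tap in each vertical and horizontal direction), the clamping at the image border, and the function names `class_tap_offsets` and `extract_class_taps` are all illustrative choices rather than details taken from the patent.

```python
def class_tap_offsets(interval):
    """Return (row, col) offsets of an assumed cross-shaped class tap."""
    s = interval
    return [(0, 0), (-s, 0), (s, 0), (0, -s), (0, s)]


def extract_class_taps(image, row, col, interval):
    """Read the pixel values at the tap positions, clamping at the border."""
    height, width = len(image), len(image[0])
    taps = []
    for d_row, d_col in class_tap_offsets(interval):
        r = min(max(row + d_row, 0), height - 1)
        c = min(max(col + d_col, 0), width - 1)
        taps.append(image[r][c])
    return taps


# A 7x7 test image whose pixel value encodes its position (10 * row + col).
image = [[10 * r + c for c in range(7)] for r in range(7)]
taps_1 = extract_class_taps(image, 3, 3, 1)  # thinning interval 1
taps_3 = extract_class_taps(image, 3, 3, 3)  # thinning interval 3
```

With a wider interval the same number of taps spans a wider neighbourhood, which is why the best interval depends on how far the blur extends.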
[0012]
Before describing an embodiment of the present invention, a previously proposed configuration is described that improves the image quality of blurred images by class classification adaptive processing, using class tap arrangements set with several different thinning amounts. First, FIG. 2 shows an example of a configuration for learning, that is, for the processing that obtains the prediction coefficients. A predetermined input image signal that contains no blur (called the teacher signal) is supplied to an LPF (Low Pass Filter) circuit 10 and a region cutout circuit 11. The LPF circuit 10 applies LPF processing to the input image signal to generate a degraded (blurred) image signal, and supplies the degraded image signal to the region cutout circuits 12 and 13, the region cutout circuits 22 and 23, and the region cutout circuits 32 and 33.
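The role of the LPF circuit 10 can be illustrated with a one-dimensional example. The sketch below is only an assumption about what such a filter might look like: the three-tap kernel (0.25, 0.5, 0.25), the edge handling, and the name `low_pass` are hypothetical, since the text does not specify the filter.

```python
def low_pass(row, kernel=(0.25, 0.5, 0.25)):
    """Blur one row of pixel values with a small symmetric kernel.

    Pixels beyond either end are replaced by the nearest edge pixel.
    """
    half = len(kernel) // 2
    last = len(row) - 1
    blurred = []
    for i in range(len(row)):
        acc = 0.0
        for k, weight in enumerate(kernel):
            j = min(max(i + k - half, 0), last)
            acc += weight * row[j]
        blurred.append(acc)
    return blurred


teacher = [0, 0, 0, 100, 0, 0, 0]   # sharp peak in the teacher signal
student = low_pass(teacher)         # degraded signal used for learning
```

The peak of the teacher signal is halved and spread over its neighbours in the degraded signal, and learning then looks for coefficients that map the degraded taps back to the teacher value.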
[0013]
The region cutout circuit 12 cuts out a predetermined range of pixels from the degraded image signal as prediction taps and supplies them to the normal equation addition circuit 16. The region cutout circuit 13 cuts out a predetermined range of pixels as class taps, using the class tap arrangement with thinning interval 1 shown in FIG. 1A, and supplies them to the feature extraction circuit 14. The feature extraction circuit 14 extracts features of the degraded image signal from the output of the region cutout circuit 13 and supplies the extracted features to the class code generation circuit 15. The class code generation circuit 15 generates a class code from the output of the feature extraction circuit 14 and supplies it to the normal equation addition circuit 16.
[0014]
Meanwhile, the region cutout circuit 11 cuts out a predetermined pixel region from the teacher signal and supplies it to the normal equation addition circuit 16. The normal equation addition circuit 16 performs predetermined calculations on the outputs of the region cutout circuit 11, the region cutout circuit 12, and the class code generation circuit 15 to generate the data for the normal equation from which the prediction coefficients of each class are calculated, and supplies the generated data to the prediction coefficient determination circuit 17. The prediction coefficient determination circuit 17 solves the normal equation based on the supplied data to calculate the prediction coefficients and supplies them to the memory 18, which stores them.
[0015]
The region cutout circuit 23, feature extraction circuit 24, class code generation circuit 25, normal equation addition circuit 26, prediction coefficient determination circuit 27, and memory 28 are the same as the region cutout circuit 13, feature extraction circuit 14, class code generation circuit 15, normal equation addition circuit 16, prediction coefficient determination circuit 17, and memory 18 described above, respectively. However, the region cutout circuit 23 cuts out its region using the class tap arrangement with thinning interval 2 shown in FIG. 1B.
[0016]
Likewise, the region cutout circuit 33, feature extraction circuit 34, class code generation circuit 35, normal equation addition circuit 36, prediction coefficient determination circuit 37, and memory 38 are the same as the region cutout circuit 13, feature extraction circuit 14, class code generation circuit 15, normal equation addition circuit 16, prediction coefficient determination circuit 17, and memory 18 described above, respectively. However, the region cutout circuit 33 cuts out its region using the class tap arrangement with thinning interval 3 shown in FIG. 1C.
[0017]
The calculation of the prediction coefficients is now described in more detail. As described later with reference to FIG. 4, the predicted image signal produced by class classification adaptive processing consists of pixel values y that are predicted one after another according to equation (1) below, from prediction taps extracted at predetermined pixel positions of the input image signal and prediction coefficients obtained by learning.
[0018]
$$y = w_1 x_1 + w_2 x_2 + \cdots + w_n x_n \qquad (1)$$
Here, x_1, ..., x_n are the prediction taps and w_1, ..., w_n are the prediction coefficients.
[0019]
The normal equation addition circuits 16, 26, and 36 perform addition processing based on the prediction taps supplied from the region cutout circuits 12, 22, and 32, the class codes supplied from the class code generation circuits 15, 25, and 35, and the teacher signal, respectively, and thereby calculate the data needed to solve the normal equations whose solutions are the prediction coefficients w_1, ..., w_n. The prediction coefficient determination circuits 17, 27, and 37 then calculate the prediction coefficients by solving the normal equations based on the supplied data.
[0020]
The normal equation is now described. In equation (1) above, the prediction coefficients w_1, ..., w_n are undetermined before learning. Learning is performed by inputting a plurality of teacher signals for each class. When the number of kinds of teacher signals is written as m, the following equation (2) is set up from equation (1).
[0021]
$$y_k = w_1 x_{k1} + w_2 x_{k2} + \cdots + w_n x_{kn} \qquad (2)$$
(k = 1, 2, ..., m)
When m > n, the prediction coefficients w_1, ..., w_n are not uniquely determined, so the elements e_k of an error vector e are defined by equation (3) below, and the prediction coefficients are chosen so as to minimize e² as defined by equation (4). In other words, the prediction coefficients are determined uniquely by the so-called least squares method.
[0022]
$$e_k = y_k - \{ w_1 x_{k1} + w_2 x_{k2} + \cdots + w_n x_{kn} \} \qquad (3)$$
(k = 1, 2, ..., m)
[0023]
[Equation 1]

[0024]
As a practical way to find the prediction coefficients that minimize e² in equation (4), e² is partially differentiated with respect to each prediction coefficient w_i (i = 1, 2, ...) (equation (5)), and each w_i is determined so that the partial derivative is 0 for every value of i.
[0025]
[Equation 2]

[0026]
A specific procedure for determining each prediction coefficient w_i from equation (5) is as follows. If X_ji and Y_i are defined as in equations (6) and (7), equation (5) can be written in the matrix form of equation (8).
[0027]
[Equation 3]

[0028]
[Equation 4]

[0029]
[Equation 5]
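
The equation images for (4) through (8) are not reproduced in this text. A reconstruction consistent with equations (2) and (3) and with the definitions in the surrounding paragraphs, assuming the standard least squares derivation, would read as follows. The summation ranges (k = 1, ..., m) and the sign written out in (5) are assumptions, not taken from the source.

```latex
\begin{align}
e^2 &= \sum_{k=1}^{m} e_k^2 \tag{4} \\
\frac{\partial e^2}{\partial w_i}
  &= \sum_{k=1}^{m} 2\,\frac{\partial e_k}{\partial w_i}\,e_k
   = -\sum_{k=1}^{m} 2\,x_{ki}\,e_k = 0
   \qquad (i = 1, \ldots, n) \tag{5} \\
X_{ji} &= \sum_{k=1}^{m} x_{ki}\,x_{kj} \tag{6} \\
Y_i &= \sum_{k=1}^{m} x_{ki}\,y_k \tag{7} \\
\begin{bmatrix}
X_{11} & X_{12} & \cdots & X_{1n} \\
X_{21} & X_{22} & \cdots & X_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
X_{n1} & X_{n2} & \cdots & X_{nn}
\end{bmatrix}
\begin{bmatrix} w_1 \\ w_2 \\ \vdots \\ w_n \end{bmatrix}
&=
\begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix} \tag{8}
\end{align}
```

Substituting (3) into (5) gives \sum_j X_{ji} w_j = Y_i for each i, and since X_{ji} = X_{ij}, this is equation (8) written row by row.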

[0030]
Equation (8) is what is generally called the normal equation. The prediction coefficient determination circuits 17, 27, and 37 calculate the prediction coefficients w_i by solving the normal equation, based on the normal equation data, with a general matrix solution method such as the sweep-out method (Gauss-Jordan elimination).
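The division of labour between the normal equation addition circuits and the prediction coefficient determination circuits can be sketched as below. This is an illustrative sketch, not the patent's circuitry: the running sums follow the usual least squares definitions of X and Y, the solver is a plain Gauss-Jordan elimination with partial pivoting, and the names `new_accumulator`, `add_sample`, and `solve_normal_equation` as well as the sample data are hypothetical.

```python
def new_accumulator(n):
    """Zeroed sums for one class: the n x n matrix X and the n-vector Y."""
    return {"X": [[0.0] * n for _ in range(n)], "Y": [0.0] * n}


def add_sample(acc, taps, teacher_value):
    """Add one (prediction taps, teacher pixel) pair to the running sums."""
    n = len(taps)
    for i in range(n):
        acc["Y"][i] += taps[i] * teacher_value
        for j in range(n):
            acc["X"][i][j] += taps[i] * taps[j]


def solve_normal_equation(acc):
    """Solve X w = Y by Gauss-Jordan elimination with partial pivoting."""
    n = len(acc["Y"])
    a = [row[:] + [y] for row, y in zip(acc["X"], acc["Y"])]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("singular normal equation; more samples needed")
        a[col], a[pivot] = a[pivot], a[col]
        scale = a[col][col]
        a[col] = [v / scale for v in a[col]]
        for r in range(n):
            if r != col:
                factor = a[r][col]
                a[r] = [v - factor * p for v, p in zip(a[r], a[col])]
    return [a[i][n] for i in range(n)]


# Hypothetical training samples: (class code, prediction taps, teacher pixel).
samples = [
    (0, [1.0, 0.0], 2.0),
    (0, [0.0, 1.0], 3.0),
    (0, [1.0, 1.0], 5.0),
]
accumulators = {}  # class code -> running sums for that class
for class_code, taps, teacher_value in samples:
    acc = accumulators.setdefault(class_code, new_accumulator(len(taps)))
    add_sample(acc, taps, teacher_value)

coefficients = {code: solve_normal_equation(acc)
                for code, acc in accumulators.items()}
```

Keeping only the sums X and Y per class means the training pixels themselves never need to be stored, which matches the description of the addition circuits passing "data related to the normal equation" to the determination circuits.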
[0031]
FIG. 2 shows a configuration in which the thinning interval of the prediction taps is fixed and only the thinning interval of the class taps is varied, but the thinning interval of the prediction taps may be varied together with that of the class taps. FIG. 3 shows such a configuration. Each component in FIG. 3 is the same as the component with the same reference numeral in FIG. 2, except that the region cutout circuits 12, 22, and 32 cut out prediction taps with tap structures of thinning intervals 1, 2, and 3, respectively.
[0032]
Next, a configuration in one embodiment of the present invention is described that uses the prediction coefficients obtained as described above to generate a predicted image signal for each class tap arrangement, one per thinning interval. FIG. 4 shows an example of such a configuration. An input image signal is supplied to the region cutout circuits 101, 102, 201, 202, 301, and 302. The input image signal may have been blurred by band limitation in a transmission path or the like, in which case the displayed image would be blurred if the signal were used as is. The region cutout circuits 101, 201, and 301 cut out from the input image signal the pixel regions used as prediction taps and supply them to the estimation calculation units 106, 206, and 306, respectively.
[0033]
Meanwhile, the region cutout circuits 102, 202, and 302 cut out predetermined pixel regions from the supplied image signal as class taps and supply them to the feature extraction circuits 103, 203, and 303, respectively. The feature extraction circuits 103, 203, and 303 extract features of the blurred image from the outputs of the region cutout circuits 102, 202, and 302, respectively, and supply the extracted features to the class code generation circuits 104, 204, and 304, respectively.
[0034]
The class code generation circuits 104, 204, and 304 generate class codes based on the outputs of the feature extraction circuits 103, 203, and 303, and supply the generated class codes to the ROMs 105, 205, and 305, respectively. The ROMs 105, 205, and 305 store the prediction coefficients calculated as described above. That is, the storage contents of the memories 18, 28, and 38 in FIG. 2 or FIG. 3 are loaded in advance into the ROMs 105, 205, and 305, respectively.
[0035]
The ROMs 105, 205, and 305 output the prediction coefficients corresponding to the outputs of the class code generation circuits 104, 204, and 304, respectively. The output prediction coefficients are supplied to the estimation calculation units 106, 206, and 306, respectively. The estimation calculation unit 106 estimates each pixel value y by calculating a linear combination (see Expression (1)) of the prediction coefficients supplied from the ROM 105 and the pixel values of the prediction tap supplied from the region cutout circuit 101, and generates a predicted image signal as the whole of the estimated pixel values y.
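The estimation step can be illustrated with a short Python sketch. It assumes that the linear combination has the form y = w_1 × x_1 + ... + w_n × x_n, with the coefficients looked up by class code. The dictionary standing in for the ROM, the coefficient values, and the function name are hypothetical.

```python
def estimate_pixel(coefficient_rom, class_code, prediction_tap):
    """Return y = sum_i w_i * x_i using the coefficients stored for class_code."""
    weights = coefficient_rom[class_code]
    if len(weights) != len(prediction_tap):
        raise ValueError("tap length does not match coefficient count")
    return sum(w * x for w, x in zip(weights, prediction_tap))


# Hypothetical ROM contents: class code -> prediction coefficients.
rom_105 = {0: [0.25, 0.5, 0.25], 1: [-0.5, 2.0, -0.5]}

tap = [100, 120, 100]                      # pixel values of one prediction tap
y_smooth = estimate_pixel(rom_105, 0, tap)
y_sharp = estimate_pixel(rom_105, 1, tap)
```

The same tap yields different estimates depending on the class, which is how the classification steers the amount of blur correction.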
[0036]
In the predicted image generated in this way, the blur caused by degradation of the input image signal in the transmission path or the like is eliminated or reduced. In the configuration illustrated in FIG. 4, the thinning intervals in the region cutout circuits 101, 201, and 301 are the same, but, in correspondence with the configuration illustrated in FIG. 3, the region cutout circuits 101, 201, and 301 may instead cut out prediction taps having tap structures with thinning intervals 1, 2, and 3, respectively.
[0037]
In the present invention, the optimum predicted value is determined from among the plurality of (for example, three) predicted values obtained in the class classification adaptive processing by using class tap structures with a plurality of (for example, three) thinning intervals as described above, and the final output image signal is created based on the determination result.
[0038]
An example of an image processing apparatus according to an embodiment of the present invention is shown in FIG. 5. The input image signal is supplied to the predicted image generation unit 51. The predicted image generation unit 51 generates a predicted image for each thinning interval, and the configuration described above with reference to FIG. 4 can be used as the predicted image generation unit 51. When such a configuration is used, the predicted image generation unit 51 generates, for example, three predicted image signals corresponding to three types of thinning intervals, and supplies these three predicted image signals to the predicted value selection circuit 53. However, the number of types of thinning intervals and the number of predicted image signals generated for them are not limited to three. The predicted value selection circuit 53 selects the maximum predicted value and the minimum predicted value for each pixel position from the supplied predicted values, and supplies the selected predicted values to the output value determination circuit 54.
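The operation of the predicted value selection circuit 53 can be sketched in Python as follows. The sketch treats each predicted image signal as a one-dimensional list of pixel values for brevity. The function name and the sample values are hypothetical.

```python
def select_max_min(predicted_images):
    """For each pixel position return (MAX, MIN) over the candidate images."""
    return [(max(vals), min(vals)) for vals in zip(*predicted_images)]


# Hypothetical predicted signals for three thinning intervals.
pred_interval_1 = [10, 52, 90, 50]
pred_interval_2 = [11, 50, 96, 49]
pred_interval_3 = [10, 51, 84, 50]

pairs = select_max_min([pred_interval_1, pred_interval_2, pred_interval_3])
# pairs == [(11, 10), (52, 50), (96, 84), (50, 49)]
```

Because only the maximum and minimum are passed on, the later decision is a two-way choice regardless of how many thinning intervals are used.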
[0039]
The input image signal is also supplied to the region cutout circuit 55. The region cutout circuit 55 cuts out a predetermined pixel region from the input image signal as a class tap and supplies the cutout pixel region to the feature extraction circuit 56. The feature extraction circuit 56 extracts the features of the blurred image based on the output of the region cutout circuit 55 and supplies the extracted features to the class code generation circuit 57. The class code generation circuit 57 generates a class code based on the output of the feature extraction circuit 56 and supplies the generated class code to the statistical data ROM 58.
[0040]
The statistical data ROM 58 stores, for each class code, statistical data as described later, and supplies the statistical data corresponding to the class code supplied from the class code generation circuit 57 to the predicted value determination circuit 54. The predicted value determination circuit 54 refers to the supplied statistical data, determines which of the maximum predicted value and the minimum predicted value supplied from the predicted value selection circuit 53 is optimum for each pixel position, and creates an output image signal based on the determination result. This processing is described specifically with reference to FIG. 6. Here, an example of the signal waveform of the original image is indicated by black circles, and the predicted values generated by the predicted image generation unit 51 at the corresponding positions are indicated by three white circles each. The optimum predicted value is the white circle closest to the black circle. Accordingly, the maximum of the three predicted values should be selected where the peak of the signal waveform is convex upward, and the minimum of the three predicted values should be selected where the peak is convex downward. In portions other than the peaks, the level difference between the original image signal and the predicted image signals is small, so any of the three predicted values may be selected as the optimum predicted value.
In view of this, in one embodiment of the present invention, the predicted value selection circuit 53 selects the maximum and minimum predicted values from the three predicted values generated by the predicted image generation unit 51, and the predicted value determination circuit 54 selects the optimum one of the maximum and minimum predicted values as the output image signal by referring to statistical data as described later.
[0041]
As the features for classification output from the feature extraction circuit 56, for example, the sign and magnitude of the first derivative and the sign of the second derivative can be used. For example, when a class tap structure as shown in FIG. 7, consisting of 5 horizontal taps centered on the pixel of interest, is used, the following bits are assigned to each pixel position: 1 bit for the sign of the first derivative D, 1 bit for the magnitude of the first derivative D (that is, the value obtained by quantizing |D| by 1-bit ADRC), and 1 bit for the sign of the second derivative E. The number of bits for 5 pixels is therefore 3 × 5 = 15 bits, and the total number of classes is 2¹⁵ = 32768. The 5 horizontal taps used here may also be thinned out. The first derivative D and the second derivative E are calculated by the following formulas (9) and (10), respectively.
[0042]
In one embodiment of the present invention, it is assumed that the ADRC circuit 3 compresses each of the 5 pixels of SD data separated by the area cut-out circuit 2 to 2 bits. Hereafter, the compressed SD data are written as q₁ to q₅. These pattern-compressed data are supplied to the class code generation circuit 6.
[0043]
D(i, j) = f(i, j+1) − f(i, j−1)   (9)
E(i, j) = f(i, j+1) + f(i, j−1) − 2 × f(i, j)   (10)
Here, i and j are coordinates representing the pixel position two-dimensionally, and f(i, j) is the pixel value at that pixel position. The case where the class taps are arranged in two dimensions has been described as an example, but the present invention can also be applied to the case where the class taps are arranged in three dimensions (space and time). Note that ADRC (Adaptive Dynamic Range Coding) is a process for expressing a temporal or spatial change pattern of data with a small number of bits. Let DR be the dynamic range of |D|, n the bit allocation, L the data level of |D| at each pixel position, and Q the requantization code. The range between the maximum value MAX and the minimum value MIN of |D| is then divided equally according to the designated bit length, and requantization is performed as follows.
[0044]
DR = MAX − MIN + 1
Q = {(L − MIN + 0.5) × 2ⁿ / DR}   (11)
Here, { } denotes truncation of the fractional part.
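The class code generation described above can be sketched in Python as follows. This is a sketch under several assumptions that are not fixed by the text: the first derivative is taken as the central difference f(i, j+1) − f(i, j−1), the requantization uses the factor 2 to the power n (which equals 2 for 1-bit ADRC), a negative sign is encoded as bit value 1, and the three bits of each tap are packed from the leftmost tap downward. The function names and the sample line are hypothetical.

```python
def adrc_requantize(levels, n_bits=1):
    """Requantize each level L as Q = trunc((L - MIN + 0.5) * 2**n / DR)."""
    lo, hi = min(levels), max(levels)
    dr = hi - lo + 1                      # DR = MAX - MIN + 1
    return [int((lv - lo + 0.5) * (2 ** n_bits) / dr) for lv in levels]


def class_code(row, j):
    """15-bit class code from 5 horizontal taps centered on column j.

    `row` is one image line (fixed i); j must leave 3 pixels on each side.
    """
    cols = range(j - 2, j + 3)
    d = [row[c + 1] - row[c - 1] for c in cols]               # first derivative D
    e = [row[c + 1] + row[c - 1] - 2 * row[c] for c in cols]  # second derivative E
    mag = adrc_requantize([abs(v) for v in d], n_bits=1)      # 1-bit ADRC of |D|
    code = 0
    for dv, mv, ev in zip(d, mag, e):
        # Assumed bit layout per tap: sign(D), quantized |D|, sign(E).
        bits = (int(dv < 0) << 2) | (mv << 1) | int(ev < 0)
        code = (code << 3) | bits
    return code


line = [10, 10, 10, 20, 40, 60, 40, 20, 10, 10, 10]  # hypothetical peak
code = class_code(line, 5)
total_classes = 2 ** (3 * 5)                          # 3 bits x 5 taps
```

For this symmetric peak the per-tap bit triples are 010, 010, 001, 110, 110, so the sign of E marks the convex-upward center while the sign of D flips across it.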
[0045]
If the predicted value determination circuit 54 is configured to operate by referring to two types of statistical data, the optimum predicted value can be determined with higher accuracy. An example of such a configuration is shown in FIG. 8. In FIG. 8, the same components as those in FIG. 5 are denoted by the same reference numerals. As processing series for outputting statistical data, this configuration has a first series consisting of the region cutout circuit 55, the feature extraction circuit 56, the class code generation circuit 57, and the statistical data ROM 58, and a second series consisting of a region cutout circuit 155, a feature extraction circuit 156, a class code generation circuit 157, and a statistical data ROM 158.
[0046]
Here, the accuracy of determining the optimum predicted value can be improved by a control method such as the following: when the predicted value determination circuit 54, operating by reference to the first statistical data output from the statistical data ROM 58 by the first series, reaches only an ambiguous determination, the predicted value determination circuit 54 is controlled to operate by reference to the second statistical data output from the statistical data ROM 158 by the second series. It is also possible to provide three or more series for generating class codes, together with three or more statistical data ROMs storing the statistical data corresponding to each of the three or more types of class codes, and to have the predicted value determination circuit 54 refer to three or more types of statistical data, thereby further improving the accuracy of determining the optimum predicted value.
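The fallback between series of statistical data can be sketched in Python as follows. The sketch assumes that each series supplies a pair of counts (count for the maximum value, count for the minimum value) for the pixel's class, and that a determination is treated as ambiguous when the larger count is less than a given share of the total. The share min_ratio, the averaging used when every series is ambiguous, and the function name are hypothetical choices.

```python
def pick_from_series(vmax, vmin, counts_per_series, min_ratio=0.6):
    """Use the first series whose (count_max, count_min) pair is unambiguous."""
    for count_max, count_min in counts_per_series:
        total = count_max + count_min
        if total and max(count_max, count_min) / total >= min_ratio:
            return vmax if count_max > count_min else vmin
    # Hypothetical last resort when every series is ambiguous.
    return (vmax + vmin) / 2


# First series is ambiguous (6 vs 5), second series clearly favors MIN.
out_a = pick_from_series(96, 84, [(6, 5), (2, 15)])
# First series is already decisive, so the second one is never consulted.
out_b = pick_from_series(96, 84, [(10, 1), (0, 9)])
# Both series are ambiguous.
out_c = pick_from_series(96, 84, [(6, 5), (5, 5)])
```

Passing the series as an ordered list extends naturally to three or more series.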
[0047]
Next, the generation of the statistical data referred to in the operation of the predicted value determination circuit 54 will be described with reference to FIG. 9. An input image signal serving as a teacher signal is supplied to the LPF 61, the optimum predicted value output circuit 64, and the residual calculation circuit 65. The LPF 61 applies low-pass filtering to the supplied image signal to generate a degraded image signal. The output of the LPF 61 is supplied to the predicted image generation unit 62 and the region cutout circuit 66.
[0048]
The predicted image generation unit 62 generates a predicted image for each thinning interval. As the predicted image generation unit 62, one having the same configuration as the predicted image generation unit 51 in FIGS. 5 and 8 is used. The predicted image signals output by the predicted image generation unit 62, for example three of them, are supplied to the predicted value selection circuit 63. The predicted value selection circuit 63 selects the maximum predicted value MAX and the minimum predicted value MIN for each pixel position from the predicted values supplied from the predicted image generation unit 62, and supplies them to the optimum predicted value output circuit 64.
[0049]
The optimum predicted value output circuit 64 selects, from the outputs of the predicted value selection circuit 63, the one with the smaller residual with respect to the teacher signal, that is, the smaller absolute value of the signal level difference, and supplies the selected predicted value (MAX or MIN) to the residual calculation circuit 65 as the optimum predicted value. The optimum predicted value output circuit 64 also supplies a signal d, indicating whether the selected predicted value is MAX or MIN, to the count circuit 69. The residual calculation circuit 65 calculates the residual between the optimum predicted value supplied from the optimum predicted value output circuit 64 and the teacher signal, and supplies the calculated value to the count circuit 69.
[0050]
On the other hand, the region cutout circuit 66 cuts out a predetermined pixel region from the output of the LPF 61 as a class tap and supplies the cutout pixel region to the feature extraction circuit 67. The feature extraction circuit 67 extracts the features of the image by performing the same operation as the feature extraction circuit 56 in FIG. 5. The features extracted by the feature extraction circuit 67 are supplied to the class code generation circuit 68. The class code generation circuit 68 generates a class code based on the output of the feature extraction circuit 67 and supplies the generated class code to the count circuit 69. The region cutout circuit 66, the feature extraction circuit 67, and the class code generation circuit 68 are the same as the region cutout circuit 55, the feature extraction circuit 56, and the class code generation circuit 57 in FIG. 5.
[0051]
The count circuit 69 generates statistical data by counting only those pixels whose residual supplied from the residual calculation circuit 65 is larger than a predetermined threshold value, separately for each class code supplied from the class code generation circuit 68 and according to the signal d described above, that is, according to whether the optimum predicted value output circuit 64 selected the maximum value or the minimum value for the pixel. The generated statistical data is supplied to the memory 70 and stored therein. Whether a given pixel is counted can thus be decided, for example, by whether the residual for that pixel is larger than the predetermined threshold value. An example of statistical data is shown in FIG. 10. In FIG. 10, when the class code indicates class 1, the count of pixels for which the maximum value was selected by the optimum predicted value output circuit 64 and whose residual with respect to the teacher signal is larger than the threshold value is 10, and the count of pixels for which the minimum value was selected under the same condition is 1.
[0052]
Similarly, when the class code indicates class 2, the counts of pixels selected as the maximum value / minimum value and having a large residual with respect to the teacher signal are 2 / 15, and when the class code indicates class 3, the corresponding counts are 5 / 10.
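The generation of the statistical data by the optimum predicted value output circuit 64, the residual calculation circuit 65, and the count circuit 69 can be sketched in Python as follows. The signals are treated as one-dimensional lists for brevity. The encoding of the signal d (0 for MAX, 1 for MIN), the choice of MAX when both residuals are equal, and the function name are assumptions made for illustration.

```python
from collections import defaultdict


def build_statistics(teacher, max_min_pairs, class_codes, threshold):
    """Count, per class code, how often MAX or MIN is the better prediction.

    Only pixels whose residual exceeds `threshold` are counted.
    Returns {class_code: [count_max, count_min]}.
    """
    stats = defaultdict(lambda: [0, 0])
    for y, (vmax, vmin), code in zip(teacher, max_min_pairs, class_codes):
        res_max, res_min = abs(y - vmax), abs(y - vmin)
        # Signal d: 0 means MAX was selected, 1 means MIN (ties go to MAX
        # by assumption).
        d = 0 if res_max <= res_min else 1
        residual = min(res_max, res_min)   # residual of the selected value
        if residual > threshold:
            stats[code][d] += 1
    return dict(stats)


# Hypothetical teacher pixels, (MAX, MIN) pairs, and class codes.
teacher = [100, 50, 20, 80]
pairs = [(96, 84), (52, 50), (30, 12), (79, 70)]
codes = [1, 1, 2, 1]
stats = build_statistics(teacher, pairs, codes, threshold=1)
```

In this example the second and fourth pixels are not counted because the residual of their selected value does not exceed the threshold.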
[0053]
In this way, by obtaining the statistical data only from pixels with large residuals, that is, by counting only pixels whose residual is larger than a certain level, portions where conditions such as the thinning interval contribute little to the improvement of image quality are excluded, and only the portions that contribute to the improvement of image quality are reflected in the statistical data.
[0054]
The portions reflected in the statistical data may also be limited by focusing on a quantity other than the per-pixel residual between the optimum predicted value and the teacher signal. For example, the portions reflected in the statistical data may be limited by focusing on the magnitude of the dynamic range DR among the plurality of predicted values for the respective thinning intervals, or on a differential value. Depending on the restriction imposed, a plurality of types of statistical data can be obtained. In this way, the plurality of types of statistical data used in the configuration described above with reference to FIG. 8 can be generated. The processing in the region cutout circuits 55 and 155 and in the feature extraction circuits 56 and 156 may also be changed.
[0055]
The statistical data stored in the memory 70 is loaded into the statistical data ROM 58 shown in FIGS. 5 and 8. The predicted value determination circuit 54 in FIGS. 5 and 8 determines, as the suitable predicted pixel value, whichever of the maximum pixel value and the minimum pixel value supplied from the predicted value selection circuit 53 has the larger count value in the statistical data. However, when the count values for the maximum pixel value and the minimum pixel value in the statistical data are close to each other, a determination made by referring to the statistical data is considered to have low reliability. In such a case, an incorrect determination may cause a significant deterioration in image quality.
[0056]
In view of this, when the count values for the maximum pixel value and the minimum pixel value in the statistical data are close to each other, processing such as the following ensures the reliability of the determination of a suitable predicted pixel value by the predicted value determination circuit 54. (1) In the configuration shown in FIG. 8, statistical data other than the statistical data in question is referred to in the operation of the predicted value determination circuit 54. (2) The average of the maximum pixel value and the minimum pixel value is output from the predicted value determination circuit 54. Both (1) and (2) are examples, and the processing is not limited to these.
[0057]
Whether the count values for the maximum pixel value and the minimum pixel value in the statistical data are close to each other can be determined, for example, by whether the following expression (12) holds.
[0058]
a / (a + b) < threshold   (12)
Here, a represents the higher of the two frequencies (count values) and b the lower. threshold is a predetermined threshold value.
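The determination by the predicted value determination circuit 54 can be sketched in Python as follows, using processing (2), the average of the maximum and minimum values, when the counts are close. The sketch assumes that the denominator of expression (12) is the sum of the two counts. The threshold value 0.6, the handling of a class with no counts, and the function name are hypothetical.

```python
def determine_output(vmax, vmin, counts, threshold=0.6):
    """Choose MAX or MIN from counts = (count_max, count_min) for the class."""
    count_max, count_min = counts
    a, b = max(counts), min(counts)        # a: higher count, b: lower count
    # Expression (12): the counts compete when a / (a + b) < threshold.
    # A class with no counts at all is treated as competing (assumption).
    if a + b == 0 or a / (a + b) < threshold:
        return (vmax + vmin) / 2           # processing (2): average
    return vmax if count_max > count_min else vmin


# Example count pairs (count_max, count_min) for three classes.
out_class_1 = determine_output(96, 84, (10, 1))    # MAX clearly dominates
out_class_2 = determine_output(96, 84, (2, 15))    # MIN clearly dominates
out_class_3 = determine_output(96, 84, (5, 10))    # 10/15 is about 0.67
out_strict = determine_output(96, 84, (5, 10), threshold=0.7)
```

Since a / (a + b) always lies between 0.5 and 1, only thresholds in that range change the behavior, and a higher threshold makes the circuit fall back to the average more often.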
[0059]
The present invention is not limited to the above-described embodiment of the present invention, and various modifications and applications can be made without departing from the gist of the present invention.
[0060]
[Effects of the Invention]
As described above, according to the present invention, the optimum result is selected, by referring to statistical data, from among the results of class classification adaptive processing based on a plurality of types of class tap structures and/or prediction tap structures having different thinning intervals, and the selected image signal is used as the output image signal.
[0061]
For this reason, for each pixel position in the input image signal, an output image signal can be generated using the result of class classification adaptive processing under the thinning interval that is better suited to the degree of image degradation at that pixel position, the features of the image, and the like.
[0062]
In particular, since the present invention allows the optimum thinning interval to be selected for each pixel, a greater improvement in image quality can be realized than when class classification adaptive processing is performed on the entire image signal under a fixed thinning interval.
[0063]
In addition, the maximum and minimum values are detected from the results of class classification adaptive processing under different thinning intervals, and the optimum value is selected from the detected maximum and minimum values. Compared with selecting the optimum one from all the results of class classification adaptive processing under different thinning intervals (for example, three predicted image signals), this makes it possible to select the optimum predicted image signal more easily and more accurately, particularly near the peaks of the signal waveform.
[0064]
Furthermore, in the process of obtaining the statistical data, more accurate statistical data can be obtained for improving the image quality by taking into account predetermined conditions related to the characteristics of the image signal.
[0065]
Further, if a configuration using a plurality of types of statistical data is used, a greater improvement in image quality can be realized.
[Brief description of the drawings]
FIG. 1 is a schematic diagram illustrating an example of a class tap structure at each thinning interval.
FIG. 2 is a block diagram illustrating an example of a configuration related to learning in a class classification adaptation process.
FIG. 3 is a block diagram showing another example of a configuration related to learning in the class classification adaptation process.
FIG. 4 is a block diagram illustrating an example of a configuration relating to prediction estimation in a class classification adaptive process.
FIG. 5 is a block diagram showing an example of a configuration related to prediction estimation in one embodiment of the present invention.
FIG. 6 is a schematic diagram illustrating an example of a signal waveform of an original image signal and a signal waveform of a predicted image signal.
FIG. 7 is a schematic diagram for explaining an example of class classification when referring to statistical data.
FIG. 8 is a block diagram showing another example of the configuration according to the embodiment of the present invention.
FIG. 9 is a block diagram showing an example of a configuration for obtaining statistical data according to an embodiment of the present invention.
FIG. 10 is a schematic diagram illustrating an example of statistical data.
[Explanation of symbols]
51 ... Predicted image generation unit, 53 ... Predicted value selection circuit, 54 ... Output value determination circuit, 58 ... Statistical data ROM

Claims

Predicted image signal generation means for generating predicted pixel values from images obtained by thinning input image signals at different thinning intervals;
Optimal pixel value output means for selectively outputting one of a maximum value and a minimum value among a plurality of prediction pixel values output from the prediction image signal generation means based on the characteristics of the input image signal ;
The optimum pixel value output means includes:
Statistical data class tap extraction means for outputting, as the statistical data class tap, a pixel value at a predetermined position designated around the pixel of interest from the input image signal;
A statistical data class code generating means for generating a statistical data class code based on the output of the statistical data class tap extracting means;
Statistical data storage means for storing statistical data in advance and outputting statistical data corresponding to the output of the statistical data class code generating means;
An image information conversion apparatus characterized by comprising output value determination means for outputting an optimum value from among the maximum value and the minimum value on the basis of the output of the statistical data storage means.

In claim 1,
An image information conversion apparatus, wherein the plurality of different types of thinning intervals include one corresponding to the case where thinning is not performed.

In claim 1,
An image information conversion apparatus using two or more kinds of statistical data.

A predicted image signal generating step for generating predicted pixel values from images obtained by thinning the input image signal at a plurality of different thinning intervals;
An optimal pixel value output step for selectively outputting one of a maximum value and a minimum value among a plurality of prediction pixel values generated by the prediction image signal generation step based on the characteristics of the input image signal ;
The optimal pixel value output step includes:
A statistical data class tap extraction step for outputting, as a statistical data class tap, a pixel value at a predetermined position designated around the pixel of interest from the input image signal;
A statistical data class code generation step for generating a statistical data class code based on the output of the statistical data class tap extraction step;
A statistical data output step of outputting statistical data corresponding to the statistical data class code generated by the statistical data class code generating step from the statistical data stored in advance;
An image information conversion method characterized by comprising an output value determination step of outputting an optimum value from among the maximum value and the minimum value on the basis of the output of the statistical data output step.

Signal degradation processing means for degrading the input image signal;
Predicted image signal generating means for generating predicted pixel values, respectively, from images obtained by thinning out images output from the signal degradation processing means at different thinning intervals;
An optimum pixel value output means for selectively outputting one of a maximum value and a minimum value among a plurality of prediction pixel values output from the prediction image signal generation means with reference to the input image signal;
A learning apparatus comprising counting means for counting the number of outputs of the optimum pixel value output means that satisfy a predetermined condition, separately according to whether the pixel is the maximum value or the minimum value and according to the output of means for generating information representing a feature of the image output from the signal degradation processing means.

In claim 5 ,
The optimum pixel value output means includes:
A learning apparatus characterized in that, of the maximum value and the minimum value among the plurality of predicted pixel values output from the predicted image signal generating means, the one having the smaller residual with respect to the pixel value at the corresponding position in the input image is selectively output.

In claim 5 ,
Means for generating information representing the characteristics of the image output from the signal degradation processing means,
A class tap extracting means for outputting a pixel value at a predetermined position designated around the pixel of interest from the image output from the signal degradation processing means as a statistical data class tap;
A learning apparatus comprising class code generation means for generating a class code based on an output of the class tap extraction means.

In claim 5 ,
The predetermined condition is as follows:
An image information conversion apparatus, wherein a residual between the optimum predicted pixel value and a pixel value at a corresponding position in the predetermined image signal is larger than a predetermined threshold value.

In claim 5 ,
The predetermined condition is as follows:
An image information conversion apparatus characterized by being related to a dynamic range value when expressing a plurality of predicted pixel values.

In claim 5 ,
The predetermined condition is as follows:
An image information conversion apparatus characterized by being related to a spatial differential value of the optimum predicted pixel value.

A signal degradation processing step for degrading the input image signal;
A predicted image signal generating step for generating predicted pixel values, respectively, from images obtained by thinning out the images generated by the signal degradation processing step at different thinning intervals;
An optimum pixel value output step for selectively outputting one of a maximum value and a minimum value among a plurality of prediction pixel values generated by the prediction image signal generation step with reference to the input image signal;
A learning method comprising a counting step of counting the number of results of the optimum pixel value output step that satisfy a predetermined condition, separately according to whether the pixel is the maximum value or the minimum value and according to the output of a step of generating information representing a feature of the image generated by the signal degradation processing step.