JP4250806B2

JP4250806B2 - Field frequency conversion device and conversion method

Info

Publication number: JP4250806B2
Application number: JP12729299A
Authority: JP
Inventors: 哲二郎近藤; 靖立平; 真史内田; 正明服部; 岳志宮井
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 1999-05-07
Filing date: 1999-05-07
Publication date: 2009-04-08
Anticipated expiration: 2019-05-07
Also published as: JP2000324495A

Description

【０００１】
【発明の属する技術分野】
この発明は、入力画像に対して例えば解像度の向上等を目的とする画像情報変換を施すフィールド周波数変換装置および変換方法に関する。
【０００２】
【従来の技術】
フィールド周波数を例えば５０Ｈｚから６０Ｈｚに変換するフィールド周波数変換の方法として、フィールド間の動きを推定し、推定した動き量を用いてフィールド間に新しいフィールドを生成する処理が知られている。しかし、かかる方法においては、動き推定に失敗すると、処理結果に直接影響するという問題があった。また、かかる方法は、単なる補間処理であり、時間的または空間的な解像度を向上させることはできない。
【０００３】
また、本願出願人は、元の画像中の複数個のフィールドから抽出した画像データを使用して、クラス分類適応処理によって新たなフィールドを生成することにより、フィールド周波数を例えば５０Ｈｚから１００Ｈｚに変換する変換する方法を先に提案している。ここで、クラス分類適応処理は以下のような処理である。すなわち、入力画像から所定の範囲の画像データを切り出し、切り出した画像データのレベル分布を検出し、検出結果に基づいてクラス分類を行う。そして、分類されたクラス毎に予め決定されている予測係数と、入力画像から別途を切り出した所定範囲の画像データとに基づく演算の結果として例えばフィールド倍速化画像等の出力画像を得る。ここで、予測係数は、出力画像と同一の信号形式を有する画像（教師画像と称される）と、入力画像と同一の信号形式を有する画像（生徒画像と称される）とに基づく演算処理によって決定される。
【０００４】
【発明が解決しようとする課題】
しかし、このような方法では、入力画像の動き量が抽出される画像データの範囲より大きい場合には、的確なクラス分類適応処理を行うことができないので、正しいフィールドを生成することができない。
【０００５】
従って、この発明の目的は、特に入力画像の動き量が大きい場合等において、フィールド周波数変換をより的確に行うことが可能なフィールド周波数変換装置および変換方法を提供することにある。
【０００６】
【課題を解決するための手段】
請求項１の発明は、入力画像のフィールド間に新たなフィールドを生成することによって
入力画像のフィールド数より多いフィールド数の出力画像を形成するフィールド周波
数変換装置において、
入力画像の処理の対象である注目画素の動きベクトルを検出する動き量推定手段と、
動きベクトルによって、入力画像に対する動き補償処理を行う動き補償手段と、
動きベクトルの値の大きさによって、動きクラスを決定する動きクラス決定手段と、
動き補償手段の出力から、注目画素の周辺の複数の画像データを切り出す第１の画像切り出し手段と、
第１の画像切り出し手段によって切り出される複数の画像データのレベル分布を検出し、検出したレベル分布に基づいて空間クラスを決定する空間クラス決定手段と、
動きクラス決定手段からの動きクラスと、空間クラス決定手段からの空間クラスとを合成してクラスを決定するクラス決定手段と、
動き補償手段の出力から、注目画素の周辺の複数の画像データを切り出す第２の画像切り出し手段と、
クラスのそれぞれに対応して予め決定され、出力画像信号を推定するための予測係数を記憶し、記憶している予測係数の内から、クラス決定手段からのクラスに対応するものを出力する係数記憶手段と、
第２の画像切り出し手段によって切り出される複数の画像データと、係数記憶手段から供給される予測係数との積和演算を行い、上記出力画像の画像データの予測値を生成する演算処理手段とを有し、
予測係数は、積和演算によって、出力画像信号の画素を生成した時に、生成された値と画素の真値との誤差を最小にするように、クラス毎に予め学習によって求められて係数記憶手段に記憶される
ことを特徴とするフィールド周波数変換装置である。
【０００７】
請求項５の発明は、入力画像のフィールド間に新たなフィールドを生成することによって入力画像のフィールド数より多いフィールド数の出力画像を形成するフィールド周波数変換方法において、
入力画像の処理の対象である注目画素の動きベクトルを検出する動き量推定ステップと、
動きベクトルによって、入力画像に対する動き補償処理を行う動き補償ステップと、
動きベクトルの値の大きさによって、動きクラスを決定する動きクラス決定ステップと、
動き補償ステップの出力から、注目画素の周辺の複数の画像データを切り出す第１の画像切り出しステップと、
第１の画像切り出しステップによって切り出される複数の画像データのレベル分布を検出し、検出したレベル分布に基づいて空間クラスを決定する空間クラス決定ステップと、
動きクラス決定ステップからの動きクラスと、空間クラス決定ステップからの空間クラスとを合成してクラスを決定するクラス決定ステップと、
動き補償ステップの出力から、注目画素の周辺の複数の画像データを切り出す第２の画像切り出しステップと、
クラスのそれぞれに対応して予め決定され、出力画像信号を推定するための予測係数を記憶し、記憶している予測係数の内から、クラス決定ステップからのクラスに対応するものを出力する係数記憶ステップと、
第２の画像切り出しステップによって切り出される複数の画像データと、係数記憶ステップから供給される予測係数との積和演算を行い、上記出力画像の画像データの予測値を生成する演算処理ステップとを有し、
予測係数は、積和演算によって、出力画像信号の画素を生成した時に、生成された値と画素の真値との誤差を最小にするように、クラス毎に予め学習によって求められて係数記憶ステップに記憶される
ことを特徴とするフィールド周波数変換方法である。
【０００８】
以上のような発明によれば、クラス分類適応処理を行うに際して、入力画像の動き量を反映させることができる。
【０００９】
【発明の実施の形態】
以下、適宜図面を参照してこの発明の一実施形態について説明する。この発明の一実施形態におけるマッピング、すなわちフィールド周波数を変換する処理に係る構成の一例を図１に示す。かかる構成は、例えば、５０Ｈｚの入力画像を１００Ｈｚの出力画像に変換する処理を行うものである。入力画像が動き推定部１１に供給される。動き推定部１１は、例えばブロックマッチング等の方法によって、入力画像内の処理の対象である注目画素の動きベクトルを推定し、推定した動きベクトルを動き補償部１２および動きクラス決定部１３に供給する。動き補償部１２は、供給される動きベクトルに基づいて入力画像のフィールドをずらす動き補償処理を行う。動き補償処理の結果として生成される画像がクラスタップ選択部１４と予測タップ選択部１５とに供給される。
【００１０】
一方、動きクラス決定部１３には、動きベクトルと共に、その信頼性を示す情報が動き推定部１１から供給される。動きクラス決定部１３は、供給される動きベクトルと信頼性を示す情報とに基づいて動きクラスを決定し、決定した動きクラスを示す情報をクラスタップ選択部１４、予測タップ選択部１５およびクラス決定部１７に供給する。クラスタップ選択部１４は、動きクラスを参照して空間クラスの分類に用いる所定位置の画素（クラスタップと称される）を選択的に抽出し、抽出したクラスタップのデータを空間クラス決定部１６に供給する。空間クラス決定部１６は、供給されるデータに基づいてＡＤＲＣ(Adaptive Dynamic Range Coding) 等を含む処理を行うことによって空間クラスを決定し、決定した空間クラスを示す情報をクラス決定部１７に供給する。
【００１１】
クラス決定部１７は、空間クラス決定部１６から供給される空間クラスを示す情報と、上述したように動きクラス決定部１３から供給される動きクラスを示す情報とに基づいて最終的なクラスを決定する。クラス決定部１７は、決定した最終的なクラスを示す情報を予測係数選択部１８に供給する。予測係数選択部１８は、クラス決定部１７の出力を参照して、最終的なクラスに対応する予測係数を出力する。この予測係数が積和演算部１９に供給される。なお、予測係数選択部１８は、クラスに対応して後述するようにして予め決定された予測係数を供給され、供給される予測係数を保持するメモリを有している。
【００１２】
一方、予測タップ選択部１５は、動きクラス決定部１３から供給される動きクラスを参照して、動き補償部１２の出力から所定の画素領域（予測タップと称される）を選択的に抽出する。抽出された予測タップのデータが積和演算部１９に供給される。積和演算部１９は、予測タップのデータと、予測係数選択部１８から供給される予測係数とに基づいて、以下の式（１）に従う積和演算を行うことにより、フィールド周波数が変換された出力画像を生成する。
【００１３】
ｙ＝ｗ₁×ｘ₁＋ｗ₂×ｘ₂＋‥‥＋ｗ_n×ｘ_n （１）
ここで、ｘ₁，‥‥，ｘ_nが各予測タップの画素データであり、ｗ₁，‥‥，ｗ_nが各予測係数である。
【００１４】
次に、動き推定部１１の動作について詳細に説明する。動き推定部１１は、例えばブロックマッチング等の方法によってフレーム間の動きベクトルを推定する。ブロックマッチングの概要について図２を参照して説明する。現在フレームＦ１内のｍ×ｎ画素からなる参照ブロックＢ１内の画像と、過去フレームＦ２内に設定したｓ×ｔ画素からなる探索範囲Ｓ１中のブロックＢ１と同形の候補ブロックＢ２との間でマッチング演算を行う。すなわち、参照ブロックＢ１と候補ブロックＢ２との間で対応する位置の画素値の差分をとり、差分の絶対値をブロックＢ２の全体に渡って累積する等の処理によって候補ブロックＢ２についての評価値を作成する。
【００１５】
このような評価値を探索範囲Ｓ１中の全候補ブロックについて作成し、評価値が最小となる候補ブロックの位置を最もマッチングの良い候補ブロックの位置として決定することにより、参照ブロックＢ１に対応する動きベクトルを検出する。探索範囲Ｓ１内の候補ブロックとして１画素ずつずれたブロックを用いる場合には、全部でｓ×ｔ個の候補ブロックを取扱うことになる。なお、参照ブロックを過去フレーム内にとり、探索範囲を現在フレーム内に設定するようにしても良い。ブロックマッチングについては、本願出願人の先の提案（例えば特開昭５４−１２４９２７号公報参照）に詳細に開示されている。参照ブロック、探索範囲の大きさ等は動き推定の対象とされる画像の性質等の条件に応じて適切に設定すれば良い。この発明の一実施形態では、例えば、参照ブロックのブロックサイズが横６画素×縦３画素とされ、また、探索範囲が水平方向のみに±１６画素とされる。
【００１６】
また、この発明の一実施形態では、上述したようにして推定される動きベクトルの信頼性を以下のようにして判定する。すなわち、評価値の最小値が例えば１８０等の所定値以上となる場合に信頼性が低いと判定し、動きベクトルを無効とする。動きベクトルが無効とされる場合には、動きベクトルとして０が出力される。
【００１７】
動き補償部１２、クラスタップ選択部１４および予測タップ選択部１５においては、入力画像内の画素と出力画像内の画素との位置関係によって決まるモードに応じた処理がなされる。まず、モードについて、図３および図４を参照して説明する。入力画像内の画素と出力画像内の画素との位置関係の一例を図３に説明する。図３において、水平方向は時間方向を示し、垂直方向は画像内での垂直方向を示す。従って、垂直方向の画素の並びがフィールドを表している。また、黒丸は入力画像内の画素を示し、白丸は出力画像内の画素を示す。図４から、入力画像内の画素と出力画像内の画素との間に複数種類の位置関係があることがわかる。
【００１８】
このような位置関係について図４により詳細に示す。ここで、各モード毎に１個の出力画素を、代表例として、薄墨を付して示した。出力画像内の画素が入力画像内のフィールド上にある場合に、出力画像内の画素が入力画像内の画素とが同一位置となるようなモード（モード０）と、出力画像内の画素が入力画像内の画素の間にあるようなモード（モード３）とがある。また、出力画像内の画素が入力画像内のフィールドの間に生成されるフィールド上にある場合に、当該フィールドに対して時間的に直前に位置する入力画像内のフィールド上に出力画像内の画素と垂直方向の位置が一致する画素があるようなモード（モード１）と、当該フィールドに対して時間的に直後に位置する入力画像内のフィールド上に出力画像内の画素と垂直方向の位置が一致する画素があるようなモード（モード２）とがある。
【００１９】
動き補償部１２は、出力画像内の画素が入力画像内のフィールド上にある場合（モード０およびモード３）と、出力画像内の画素が入力画像内のフィールドの間に生成されるフィールド上にある場合（モード１およびモード２）とで異なる処理を行う。このような処理について図５および図６を参照して説明する。図５にモード０およびモード３における処理の一例を示す。ここで、縦方向が時間を示し、横方向が各フィールド内の水平方向の位置を示す。また、生成すべきフィールドが時点Ｎにおけるフィールドであるｆｉｅｌｄ（Ｎ）と同一の時間位置にある場合を例として説明する。この場合、正方形で示す入力画像内の画素の位置と、注目画素（互いに交差する２本の斜線で示す）の位置とが一致している。
【００２０】
また、図５では、ｆｉｅｌｄ（Ｎ）と、時点Ｎ＋２におけるフィールドであるｆｉｅｌｄ（Ｎ＋２）との間で推定された動きベクトルをｍｅ＿ｘと表記する。この場合に、動き補償としてフィールド（Ｎ−１）およびフィールド（Ｎ＋１）を水平方向にそれぞれｍｅ＿ｘ／２、−ｍｅ＿ｘ／２だけ引き寄せる処理が行われる。これにより、フィールド（Ｎ−１）およびフィールド（Ｎ＋１）において、水平方向の動きが見かけ上ほぼ打ち消された画像を得ることができる。
【００２１】
一方、図６にモード１およびモード２における処理の一例を示す。ここで、縦方向が時間を示し、横方向が各フィールド内の水平方向の位置を示す。また、生成すべきフィールドが時点Ｎにおけるフィールドであるｆｉｅｌｄ（Ｎ）と時点Ｎ＋１におけるフィールドであるｆｉｅｌｄ（Ｎ＋１）との間に位置する場合を例として説明する。この場合、正方形で示す入力画像内の画素と、注目画素の位置（互いに交差する２本の斜線で示す）の位置とは異なる。
【００２２】
図６においても、図５と同様に、ｆｉｅｌｄ（Ｎ）と、時点Ｎ＋２におけるフィールドであるｆｉｅｌｄ（Ｎ＋２）との間で推定された動きベクトルをｍｅ＿ｘと表記する。この場合に、動き補償として、フィールド（Ｎ−１）、ｆｉｅｌｄ（Ｎ）およびフィールド（Ｎ＋１）を水平方向にそれぞれ、３×ｍｅ＿ｘ／４、ｍｅ＿ｘ／４、および−ｍｅ＿ｘ／４、だけ引き寄せる処理が行われる。このような処理によって、フィールド（Ｎ−１）およびフィールド（Ｎ＋１）において水平方向の動きが見かけ上ほぼ打ち消された画像を得ることができる。
【００２３】
次に、動きクラス決定部１３による処理について説明する。動きクラス決定部１３は、上述したように動き推定部１１から、動きベクトルと動きベクトルの信頼性を示す情報とを供給される。これらに基づいて、動きクラスを以下のように決定する。
【００２４】
動きクラス０：動きベクトルが有効で動きベクトル値が０
動きクラス１：動きベクトルが有効で動きベクトル値の絶対値が６以下
動きクラス２：動きベクトルが有効で動きベクトル値の絶対値が７以上
動きクラス３：動きベクトルが無効（この時は動きベクトル値は０とされる）ここで、動きベクトルの信頼性が低いと判定される場合（上述したようにブロックマッチングにおける評価値の最小値が所定値以上となる場合）に動きベクトルが無効とされ、それ以外の場合は動きベクトルが有効とされる。また、動きクラス１と動きクラス２を判定する際の参照値とされている６、７等の値は一例であり、これに限定されるものではない。一般的には探索範囲の大きさ（例えば水平方向に±１６画素等）、入力画像の性質等を考慮して適切な値を参照するようにすれば良い。なお、動きクラス３は、信頼性の低い動きベクトルに基づいて不適切な動き補償が行われることを回避するためのものである。
【００２５】
次に、クラスタップ選択部１４および予測タップ選択部１５の動作について説明する。クラスタップ選択部１４および予測タップ選択部１５は、モードと動きクラスとに応じて所定位置の画素をクラスタップおよび予測タップとして抽出する。モード０、１、２、３に対応するタップ構造の一例を図７、図８、図９および図１０に示す。図７〜図１０において、動きクラス０、１の時にクラスタップまたは予測タップとして抽出される画素を黒丸で示し、動きクラス２、３の時にクラスタップまたは予測タップとして抽出される画素を白丸で示した。また、クラスタップまたは予測タップとして抽出される画素以外の画素は、全て点線の丸で示した。
【００２６】
モード０におけるタップ構造の一例を図７に示す。図７Ａに示すように、動きクラス０、１と動きクラス２、３とでクラスタップ構造が一致する。すなわち、何れの動きクラスにおいても、現在フィールドから５個、現在フィールドの１フィールド後のフィールドから２個の計７個の画素がクラスタップとして抽出される。また、図７Ｂに示すように、予測タップ構造が動きクラス０、１と動きクラス２、３とで一致する。すなわち、現在フィールドから９個、現在フィールドの１フィールド前および１フィールド後の各フィールドからそれぞれ２個の計１３個の画素が予測タップとして抽出される。
【００２７】
モード１におけるタップ構造の一例を図８に示す。動きクラス０、１の場合には、図８Ａにて黒丸で示す位置の画素（すなわち、現在フィールドから４個、現在フィールドの１フィールド前および１フィールド後の各フィールドからそれぞれ１個および３個の計８個）がクラスタップとして抽出される。また、動きクラス２、３の場合には、図８Ａにて白丸で示す位置の画素（現在フィールドから４個、現在フィールドの１フィールド後の各フィールドから５個の計８個）がクラスタップとして抽出される。一方、動きクラス０、１の場合、図８Ｂにて黒丸で示す位置の画素（現在フィールドから８個、現在フィールドの１フィールド前および１フィールド後の各フィールドからそれぞれ３個の計１４個）が予測タップとして抽出される。また、動きクラス２、３の場合には、図８Ｂにて白丸で示す位置の画素（現在フィールドから８個、現在フィールドの１フィールド後の各フィールドから５個の計１３個）が予測タップとして抽出される。
【００２８】
モード２におけるタップ構造の一例を図９に示す。図９Ａに示すように、動きクラス０、１と動きクラス２、３とでクラスタップ構造が一致する。すなわち、何れの動きクラスにおいても、現在フィールドから５個、現在フィールドの１フィールド後のフィールドから２個の計７個の画素がクラスタップとして抽出される。一方、動きクラス０、１の場合、図９Ｂにて黒丸で示す位置の画素（現在フィールドから９個、現在フィールドの１フィールド前および１フィールド後の各フィールドからそれぞれ２個の計１３個）が予測タップとして抽出される。また、動きクラス２、３の場合には、図８Ｂにて白丸で示す位置の画素（現在フィールドから９個、現在フィールドの１フィールド後のフィールドから６個の計１５個）が予測タップとして抽出される。
【００２９】
モード３におけるタップ構造の一例を図１０に示す。動きクラス０、１の場合には、図１０Ａにて黒丸で示す位置の画素（すなわち、現在フィールドから２個、現在フィールドの１フィールド前および１フィールド後の各フィールドからそれぞれ３個の計８個）がクラスタップとして抽出される。また、動きクラス２、３の場合には、図８Ａにて白丸で示す位置の画素（現在フィールドから６個、現在フィールドの１フィールド前および１フィールド後の各フィールドから１個の計８個）がクラスタップとして抽出される。一方、図１０Ｂに示すように、動きクラス０、１と動きクラス２、３とで予測タップ構造が一致する。すなわち、何れの動きクラスにおいても、現在フィールドから８個、現在フィールドの１フィールド前および１フィールド後の各フィールドからそれぞれ３個の計１４個の画素がクラスタップとして抽出される。
【００３０】
次に、学習、すなわち上述したマッピングを行うに際して使用される予測係数の算出について説明する。図１における入力画像と同一の信号形式を有する画像が生徒画像として動き推定部２１および動き補償部２２に供給される。動き推定部２１は、図１中の動き推定部１１と同様な処理を行う。すなわち、動き推定部２１は、生徒画像内の注目画素の動きベクトルを推定し、推定した動きベクトルを動き補償部２２に供給する。また、動き推定部２１は、動きベクトルと共に、その信頼性を示す情報を動きクラス決定部２３に供給する。一方、動き補償部２２は、図１中の動き補償部１２と同様な動き補償処理を行う。この動き補償処理の結果として生成される画像がクラスタップ選択部２４と予測タップ選択部２５とに供給される。
【００３１】
動きクラス決定部２３は、図１中の動きクラス決定部１３と同様な処理を行って動きクラスを決定し、決定した動きクラスを示す情報をクラスタップ選択部２４、予測タップ選択部２５およびクラス決定部２７に供給する。クラスタップ選択部２４は、図１中のクラスタップ選択部１４と同一位置の画素をクラスタップとして抽出し、抽出したクラスタップのデータを空間クラス決定部２６に供給する。空間クラス決定部２６は、供給されるデータに基づいて図１中の空間クラス決定部１６と同様な処理を行うことによって空間クラスを決定し、決定した空間クラスを示す情報をクラス決定部２７に供給する。
【００３２】
クラス決定部２７は、図１中のクラス決定部１７と同様な処理を行うことにより、最終的なクラスを決定し、最終的なクラスを示す情報をマトリクス選択部２８に供給する。マトリクス選択部２８は、最終的なクラスに対応するマトリクスを選択し、選択したマトリクスに係るデータをマトリクス加算部２９に供給する。
【００３３】
一方、予測タップ選択部２５は、図１中の予測タップ選択部１５と同一位置の画素を予測タップとして抽出し、抽出した予測タップのデータをマトリクス加算部２９に供給する。マトリクス加算部２９には、さらに、図１における出力画像と同一の信号形式の画像が教師画像として供給される。マトリクス加算部２９には、マトリクス選択部２８から供給されるデータに、予測タップのデータおよび教師画像に基づく演算結果を足し込む処理を行うことにより、正規方程式のデータを生成する。正規方程式のデータは、マトリクス加算部２９から係数決定部３０に供給される。係数決定部３０は、正規方程式を解く演算を行うことにより、予測係数を算出する。算出された予測係数は、例えば図示しないメモリに一旦記憶され、図１中の予測係数選択部内のメモリにロードされる等の方法により、図１を参照して上述した演算処理において使用されることが可能となる。
【００３４】
次に、予測係数を算出するための演算について説明する。上述の式（１）において、学習前は予測係数ｗ₁，‥‥，ｗ_nが未定係数である。学習は、クラス毎に複数の教師画像を入力することによって行う。教師画像の種類数をｍと表記する場合、式（１）から、以下の式（２）が設定される。
【００３５】
ｙ_k＝ｗ₁×ｘ_k1＋ｗ₂×ｘ_k2＋‥‥＋ｗ_n×ｘ_kn （２）
（ｋ＝１，２，‥‥，ｍ）
ｍ＞ｎの場合、予測係数ｗ₁，‥‥，ｗ_nは一意に決まらないので、誤差ベクトルｅの要素ｅ_kを以下の式（３）で定義して、式（４）によって定義される誤差ベクトルｅを最小とするように予測係数を定めるようにする。すなわち、いわゆる最小２乗法によって予測係数を一意に定める。
【００３６】
ｅ_k＝ｙ_k−｛ｗ₁×ｘ_k1＋ｗ₂×ｘ_k2＋‥‥＋ｗ_n×ｘ_kn｝（３）
（ｋ＝１，２，‥‥ｍ）
【００３７】
【数１】

【００３８】
式（４）のｅ²を最小とする予測係数を求めるための実際的な計算方法としては、ｅ²を予測係数ｗ_i(i=1,2‥‥）で偏微分し（式（５））、ｉの各値について偏微分値が０となるように各予測係数ｗ_iを定めれば良い。
【００３９】
【数２】

【００４０】
式（５）から各予測係数ｗ_iを定める具体的な手順について説明する。式（６）、（７）のようにＸ_ji，Ｙ_iを定義すると、式（５）は、式（８）の行列式の形に書くことができる。
【００４１】
【数３】

【００４２】
【数４】

【００４３】
【数５】

【００４４】
式（８）が一般に正規方程式と呼ばれるものである。マトリクス加算部２９は、正規方程式（８）中のパラメータを算出する。係数決定部３０は、算出されたパラメータに基づいて掃き出し法等の一般的な行列解法に従って正規方程式（８）を解くことにより、予測係数ｗ_i（ｉ＝１，２，‥‥，ｎ）を算出する。
【００４５】
上述したこの発明の一実施形態では、動き推定部１１による処理結果として得られる動きベクトルがクラスタップ選択、予測タップ選択およびクラス決定回路１７におけるクラス分類にも反映するようになされているが、フィールド周波数変換処理に対する動きベクトルの反映のさせ方はこれに限定されるものではない。例えば、動きベクトルを動き補償にのみ使用する構成、動きベクトルを動き補償とクラスタップおよび／または予測タップの選択に使用する構成、動きベクトルを動き補償とクラス分類に使用する構成等によっても、入力画像の動きが大きい場合にフィールド周波数変換に変換性能をある程度向上させることが可能である。
【００４６】
一般的に、より多くの構成要素の動作に動きベクトルが反映されるように構成する程、入力画像の動きが大きい場合の変換性能は良くなるが、回路構成は大型化すると考えられる。従って、この発明の適用に際しては、装置に要求される、変換性能、回路規模、コスト等の条件により適合する構成用いるようにすれば良い。
【００４７】
この発明は、上述したこの発明の一実施形態に限定されるものでは無く、この発明の主旨を逸脱しない範囲内で様々な変形や応用が可能である。
【００４８】
【発明の効果】
この発明によれば、入力画像における動き量に基づいて動き補償された画像に対してクラス分類適応処理を適用することにより、フィールド周波数変換がなされる。このため、動き量が大きい場合等においても、変換処理の精度を向上させることができる。
【００４９】
特に、クラス分類適応処理における演算に使用される画像データを切り出す際に、動き量に応じた処理を行うようにすれば、変換処理の精度をより向上させることができる。
【００５０】
また、注目画素の周辺における画像データのレベル分布による空間クラスに加えて、動き量による動きクラスを合成したクラスを形成すれば、変換処理の精度をより向上させることができる。
【図面の簡単な説明】
【図１】この発明の一実施形態における、フィールド周波数変換処理に係る構成について説明するためのブロック図である。
【図２】ブロックマッチングについて説明するための略線図である。
【図３】入力画像内の画素と出力画像内の画素との位置関係について説明するための略線図である。
【図４】モードについて説明するための略線図である。
【図５】モード０およびモード３における動き補償処理について説明するための略線図である。
【図６】モード１およびモード２における動き補償処理について説明するための略線図である。
【図７】モード０におけるタップ構造について説明するための略線図である。
【図８】モード１におけるタップ構造について説明するための略線図である。
【図９】モード２におけるタップ構造について説明するための略線図である。
【図１０】モード３におけるタップ構造について説明するための略線図である。
【図１１】この発明の一実施形態における、学習に係る処理について説明するためのブロック図である。
【符号の説明】
１１・・・動き推定部、１２・・・動き補償部、２１・・・動き推定部、２２・・・動き補償部
[0001]
[Technical field to which the invention belongs]
The present invention relates to a field frequency conversion apparatus and a conversion method for performing image information conversion on an input image for the purpose of improving resolution, for example.
[0002]
[Prior art]
As a field frequency conversion method for converting a field frequency from, for example, 50 Hz to 60 Hz, a process of estimating a motion between fields and generating a new field between fields using the estimated motion amount is known. However, such a method has a problem that if the motion estimation fails, the processing result is directly affected. Further, such a method is merely an interpolation process, and the temporal or spatial resolution cannot be improved.
[0003]
In addition, the applicant of the present application has previously proposed a method that converts the field frequency, for example from 50 Hz to 100 Hz, by generating new fields through class classification adaptive processing using image data extracted from a plurality of fields of the original image. Class classification adaptive processing works as follows. Image data in a predetermined range is cut out from the input image, the level distribution of the cut-out image data is detected, and the data is classified on the basis of the detection result. An output image, such as a field-doubled image, is then obtained by a calculation based on the prediction coefficients determined in advance for each class and on image data in a predetermined range cut out separately from the input image. The prediction coefficients are determined by arithmetic processing based on an image having the same signal format as the output image (called a teacher image) and an image having the same signal format as the input image (called a student image).
[0004]
[Problems to be solved by the invention]
However, with such a method, when the amount of motion in the input image is larger than the range of the extracted image data, class classification adaptive processing cannot be performed accurately, and a correct field therefore cannot be generated.
[0005]
Accordingly, an object of the present invention is to provide a field frequency conversion device and a conversion method capable of performing field frequency conversion more accurately, particularly when the amount of motion of an input image is large.
[0006]
[Means for Solving the Problems]
The invention of claim 1 is a field frequency conversion device that forms an output image having more fields than the input image by generating new fields between the fields of the input image, the device comprising:
motion amount estimation means for detecting a motion vector of a pixel of interest that is the target of processing in the input image;
motion compensation means for performing motion compensation processing on the input image using the motion vector;
motion class determination means for determining a motion class according to the magnitude of the motion vector value;
first image cut-out means for cutting out, from the output of the motion compensation means, a plurality of image data around the pixel of interest;
space class determination means for detecting the level distribution of the plurality of image data cut out by the first image cut-out means and determining a space class on the basis of the detected level distribution;
class determination means for determining a class by combining the motion class from the motion class determination means with the space class from the space class determination means;
second image cut-out means for cutting out, from the output of the motion compensation means, a plurality of image data around the pixel of interest;
coefficient storage means for storing prediction coefficients that are determined in advance for each class and used to estimate the output image signal, and for outputting, from among the stored prediction coefficients, those corresponding to the class from the class determination means; and
arithmetic processing means for performing a product-sum operation on the plurality of image data cut out by the second image cut-out means and the prediction coefficients supplied from the coefficient storage means, thereby generating predicted values of the image data of the output image,
wherein the prediction coefficients are obtained in advance by learning for each class so as to minimize the error between the value generated when a pixel of the output image signal is produced by the product-sum operation and the true value of that pixel, and are stored in the coefficient storage means.
[0007]
The invention of claim 5 is a field frequency conversion method that forms an output image having more fields than the input image by generating new fields between the fields of the input image, the method comprising:
a motion amount estimation step of detecting a motion vector of a pixel of interest that is the target of processing in the input image;
a motion compensation step of performing motion compensation processing on the input image using the motion vector;
a motion class determination step of determining a motion class according to the magnitude of the motion vector value;
a first image cut-out step of cutting out, from the output of the motion compensation step, a plurality of image data around the pixel of interest;
a space class determination step of detecting the level distribution of the plurality of image data cut out in the first image cut-out step and determining a space class on the basis of the detected level distribution;
a class determination step of determining a class by combining the motion class from the motion class determination step with the space class from the space class determination step;
a second image cut-out step of cutting out, from the output of the motion compensation step, a plurality of image data around the pixel of interest;
a coefficient storage step of storing prediction coefficients that are determined in advance for each class and used to estimate the output image signal, and of outputting, from among the stored prediction coefficients, those corresponding to the class from the class determination step; and
an arithmetic processing step of performing a product-sum operation on the plurality of image data cut out in the second image cut-out step and the prediction coefficients supplied from the coefficient storage step, thereby generating predicted values of the image data of the output image,
wherein the prediction coefficients are obtained in advance by learning for each class so as to minimize the error between the value generated when a pixel of the output image signal is produced by the product-sum operation and the true value of that pixel, and are stored in the coefficient storage step.
[0008]
According to the invention as described above, the amount of motion of the input image can be reflected when performing the class classification adaptive processing.
[0009]
DETAILED DESCRIPTION OF THE INVENTION
An embodiment of the present invention will be described below with reference to the drawings as appropriate. FIG. 1 shows an example of a configuration for mapping in one embodiment of the present invention, that is, for the processing that converts the field frequency. This configuration converts, for example, a 50 Hz input image into a 100 Hz output image. The input image is supplied to the motion estimation unit 11. The motion estimation unit 11 estimates, by a method such as block matching, the motion vector of the pixel of interest that is the target of processing in the input image, and supplies the estimated motion vector to the motion compensation unit 12 and the motion class determination unit 13. The motion compensation unit 12 performs motion compensation processing that shifts the fields of the input image on the basis of the supplied motion vector. The image generated as a result of the motion compensation processing is supplied to the class tap selection unit 14 and the prediction tap selection unit 15.
[0010]
Meanwhile, the motion class determination unit 13 is supplied by the motion estimation unit 11 with the motion vector together with information indicating its reliability. The motion class determination unit 13 determines a motion class on the basis of the supplied motion vector and reliability information, and supplies information indicating the determined motion class to the class tap selection unit 14, the prediction tap selection unit 15, and the class determination unit 17. The class tap selection unit 14 refers to the motion class, selectively extracts the pixels at predetermined positions that are used for space class classification (called class taps), and supplies the extracted class tap data to the space class determination unit 16. The space class determination unit 16 determines a space class by performing processing that includes ADRC (Adaptive Dynamic Range Coding) or the like on the supplied data, and supplies information indicating the determined space class to the class determination unit 17.
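The text names ADRC as one way to obtain the space class but does not give the bit depth or the packing order. The following is a minimal sketch assuming 1-bit ADRC by default: each class tap is requantized relative to the dynamic range of the taps, and the resulting bits are concatenated into one integer. The function name `adrc_class` and the `bits` parameter are illustrative assumptions, not taken from the patent.

```python
def adrc_class(taps, bits=1):
    """Requantize each class-tap value to `bits` bits within the taps'
    dynamic range and pack the results into one integer class code.

    Assumption: 1-bit ADRC and most-significant-first packing; the patent
    only states that ADRC or similar processing is used.
    """
    lo, hi = min(taps), max(taps)
    dynamic_range = hi - lo + 1      # +1 avoids division by zero on flat areas
    levels = 1 << bits
    code = 0
    for value in taps:
        q = (value - lo) * levels // dynamic_range   # quantized level, 0 .. levels-1
        code = (code << bits) | q
    return code
```

Under this assumption, seven class taps would yield at most 2^7 = 128 distinct space classes.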
[0011]
The class determination unit 17 determines a final class on the basis of the information indicating the space class supplied from the space class determination unit 16 and the information indicating the motion class supplied from the motion class determination unit 13 as described above. The class determination unit 17 supplies information indicating the determined final class to the prediction coefficient selection unit 18. The prediction coefficient selection unit 18 refers to the output of the class determination unit 17 and outputs the prediction coefficients corresponding to the final class. These prediction coefficients are supplied to the product-sum operation unit 19. The prediction coefficient selection unit 18 has a memory that holds the prediction coefficients, which are determined in advance for each class as described later and supplied to it.
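The description says the final class is determined from the space class and the motion class but does not say how the two are merged. One plausible scheme, shown here purely as an assumption, is to treat the pair as a single index so that every (motion class, space class) combination addresses its own coefficient set. The names `final_class` and `num_space_classes` are hypothetical.

```python
NUM_MOTION_CLASSES = 4  # motion classes 0 to 3, as enumerated in the description


def final_class(motion_class, space_class, num_space_classes):
    """Combine a motion class and a space class into one class index.

    Assumption: simple concatenation (motion class as the high-order part).
    The patent does not specify the combining rule.
    """
    if not 0 <= motion_class < NUM_MOTION_CLASSES:
        raise ValueError("motion_class out of range")
    if not 0 <= space_class < num_space_classes:
        raise ValueError("space_class out of range")
    return motion_class * num_space_classes + space_class
```

With 128 space classes, this would give 4 × 128 = 512 final classes, each with its own set of prediction coefficients held in the memory of the prediction coefficient selection unit.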
[0012]
Meanwhile, the prediction tap selection unit 15 refers to the motion class supplied from the motion class determination unit 13 and selectively extracts a predetermined pixel region (called prediction taps) from the output of the motion compensation unit 12. The extracted prediction tap data is supplied to the product-sum operation unit 19. On the basis of the prediction tap data and the prediction coefficients supplied from the prediction coefficient selection unit 18, the product-sum operation unit 19 performs the product-sum operation of equation (1) below and thereby generates an output image whose field frequency has been converted.
[0013]
y = w₁ × x₁ + w₂ × x₂ + ... + w_n × x_n      (1)
Here, x₁, ..., x_n are the pixel data of the prediction taps, and w₁, ..., w_n are the prediction coefficients.
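Equation (1) is an inner product of the prediction tap values with the coefficients selected for the final class. A minimal sketch follows; the function name `predict_pixel` is an illustrative assumption.

```python
def predict_pixel(prediction_taps, coefficients):
    """Evaluate equation (1): y = w1*x1 + w2*x2 + ... + wn*xn.

    prediction_taps: pixel values x1..xn extracted as prediction taps.
    coefficients:    prediction coefficients w1..wn for the selected class.
    """
    if len(prediction_taps) != len(coefficients):
        raise ValueError("taps and coefficients must have the same length")
    return sum(w * x for w, x in zip(coefficients, prediction_taps))
```

In a complete converter the result would typically also be rounded and clipped to the valid pixel range; that step is not described in the text and is omitted here.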
[0014]
Next, the operation of the motion estimation unit 11 will be described in detail. The motion estimation unit 11 estimates a motion vector between frames by a method such as block matching. An outline of block matching will be described with reference to FIG. 2. A matching operation is performed between the image in a reference block B1 of m × n pixels in the current frame F1 and a candidate block B2, which has the same shape as block B1 and lies in a search range S1 of s × t pixels set in the past frame F2. That is, an evaluation value for the candidate block B2 is created by, for example, taking the differences between the pixel values at corresponding positions in the reference block B1 and the candidate block B2 and accumulating the absolute values of the differences over the whole of block B2.
[0015]
Such an evaluation value is created for every candidate block in the search range S1, and the position of the candidate block with the smallest evaluation value is taken as the position of the best-matching candidate block; in this way the motion vector corresponding to the reference block B1 is detected. When blocks shifted one pixel at a time are used as the candidate blocks in the search range S1, a total of s × t candidate blocks are handled. Alternatively, the reference block may be taken in the past frame and the search range set in the current frame. Block matching is disclosed in detail in an earlier proposal by the applicant of the present application (see, for example, Japanese Patent Application Laid-Open No. 54-124927). The sizes of the reference block and the search range and similar parameters may be set appropriately according to conditions such as the nature of the image subject to motion estimation. In one embodiment of the present invention, for example, the reference block is 6 pixels wide by 3 pixels high, and the search range is ±16 pixels in the horizontal direction only.
[0016]
In one embodiment of the present invention, the reliability of the motion vector estimated as described above is determined as follows. That is, when the minimum value of the evaluation value is equal to or greater than a predetermined value such as 180, it is determined that the reliability is low, and the motion vector is invalidated. When the motion vector is invalidated, 0 is output as the motion vector.
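The block matching and reliability check described above can be sketched as follows, using the example values from the text (6 × 3 reference block, horizontal search of ±16 pixels, threshold 180). The function name, the argument layout, and the sign convention (the vector is the horizontal offset of the best-matching candidate block in the past frame) are assumptions made for illustration; candidates that would fall outside the frame are simply skipped, which the text does not discuss.

```python
def estimate_motion(cur, past, x0, y0, block_w=6, block_h=3,
                    search=16, threshold=180):
    """Horizontal-only block matching with a sum-of-absolute-differences
    evaluation value.

    cur, past: 2-D lists of pixel values (rows of equal length).
    (x0, y0):  top-left corner of the reference block in `cur`.
    Returns (motion_vector, valid).
    """
    width = len(cur[0])
    best_dx, best_sad = 0, None
    for dx in range(-search, search + 1):
        if x0 + dx < 0 or x0 + dx + block_w > width:
            continue  # candidate block would fall outside the frame
        sad = sum(abs(cur[y0 + j][x0 + i] - past[y0 + j][x0 + dx + i])
                  for j in range(block_h) for i in range(block_w))
        if best_sad is None or sad < best_sad:
            best_dx, best_sad = dx, sad
    if best_sad is None or best_sad >= threshold:
        return 0, False  # low reliability: vector invalidated, 0 is output
    return best_dx, True
```

Returning the validity flag alongside the vector mirrors the way the motion estimation unit supplies both the motion vector and its reliability information to the motion class determination unit.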
[0017]
In the motion compensation unit 12, the class tap selection unit 14, and the prediction tap selection unit 15, processing according to a mode determined by the positional relationship between pixels in the input image and pixels in the output image is performed. First, the mode will be described with reference to FIG. 3 and FIG. An example of the positional relationship between the pixels in the input image and the pixels in the output image will be described with reference to FIG. In FIG. 3, the horizontal direction indicates the time direction, and the vertical direction indicates the vertical direction in the image. Therefore, the arrangement of pixels in the vertical direction represents a field. A black circle indicates a pixel in the input image, and a white circle indicates a pixel in the output image. FIG. 4 shows that there are a plurality of types of positional relationships between the pixels in the input image and the pixels in the output image.
[0018]
This positional relationship is shown in more detail in FIG. 4. Here, one output pixel for each mode is shown lightly shaded as a representative example. When a pixel in the output image lies on a field of the input image, there is a mode in which the pixel in the output image is at the same position as a pixel in the input image (mode 0) and a mode in which the pixel in the output image lies between pixels in the input image (mode 3). When a pixel in the output image lies on a field generated between fields of the input image, there is a mode in which the input field immediately preceding that field in time has a pixel whose vertical position matches that of the pixel in the output image (mode 1) and a mode in which the input field immediately following that field in time has such a pixel (mode 2).
[0019]
The motion compensation unit 12 performs different processing depending on whether the pixel in the output image lies on a field of the input image (mode 0 and mode 3) or on a field generated between fields of the input image (mode 1 and mode 2). This processing will be described with reference to FIGS. 5 and 6. FIG. 5 shows an example of the processing in mode 0 and mode 3. Here, the vertical direction indicates time, and the horizontal direction indicates the horizontal position within each field. The case in which the field to be generated is at the same time position as field (N), the field at time N, will be described as an example. In this case, the positions of the pixels in the input image, indicated by squares, coincide with the position of the pixel of interest (indicated by two crossing diagonal lines).
[0020]
In FIG. 5, the motion vector estimated between field (N) and field (N + 2), the field at time N + 2, is denoted me_x. In this case, as motion compensation, field (N−1) and field (N + 1) are pulled in horizontally by me_x / 2 and −me_x / 2, respectively. As a result, an image in which the horizontal motion is apparently almost canceled can be obtained for field (N−1) and field (N + 1).
[0021]
FIG. 6, on the other hand, shows an example of the processing in mode 1 and mode 2. Here, the vertical direction indicates time, and the horizontal direction indicates the horizontal position within each field. The case in which the field to be generated lies between field (N), the field at time N, and field (N + 1), the field at time N + 1, will be described as an example. In this case, the positions of the pixels in the input image, indicated by squares, differ from the position of the pixel of interest (indicated by two crossing diagonal lines).
[0022]
In FIG. 6, as in FIG. 5, the motion vector estimated between field (N) and field (N + 2), the field at time N + 2, is denoted me_x. In this case, as motion compensation, field (N−1), field (N), and field (N + 1) are pulled in horizontally by 3 × me_x / 4, me_x / 4, and −me_x / 4, respectively. Through this processing, an image in which the horizontal motion is apparently almost canceled can be obtained for field (N−1) and field (N + 1).
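One way to read the shift amounts in both cases is as a single rule: me_x spans two field intervals, so the motion per field interval is me_x / 2, and each field is shifted in proportion to its time distance from the field being generated. The sketch below encodes that reading. It assumes constant-velocity motion and, for modes 1 and 2, a generated field midway between field (N) and field (N + 1); the function name and the use of fractional field times are illustrative assumptions.

```python
def compensation_shift(me_x, field_time, target_time):
    """Horizontal shift that pulls the field at `field_time` onto the
    time position `target_time` (both in units of input field intervals).

    me_x is measured across two field intervals (field(N) to field(N + 2)),
    so the motion per field interval is me_x / 2.
    """
    return (target_time - field_time) * me_x / 2
```

For the mode 0 and mode 3 case (target at time N), field (N−1) and field (N + 1) get me_x / 2 and −me_x / 2. For the mode 1 and mode 2 case (target at time N + 0.5), field (N−1), field (N), and field (N + 1) get 3 × me_x / 4, me_x / 4, and −me_x / 4, matching the values given in the text.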
[0023]
Next, processing by the motion class determination unit 13 will be described. The motion class determination unit 13 is supplied with the motion vector and information indicating the reliability of the motion vector from the motion estimation unit 11 as described above. Based on these, the motion class is determined as follows.
[0024]
Motion class 0: the motion vector is valid and the motion vector value is 0
Motion class 1: the motion vector is valid and the absolute value of the motion vector value is 6 or less
Motion class 2: the motion vector is valid and the absolute value of the motion vector value is 7 or more
Motion class 3: the motion vector is invalid (in this case the motion vector value is set to 0)
Here, the motion vector is treated as invalid when its reliability is judged to be low (that is, as described above, when the minimum evaluation value in block matching is equal to or greater than the predetermined value), and as valid otherwise. The values 6 and 7 used as reference values for distinguishing motion class 1 from motion class 2 are examples, and the invention is not limited to them. In general, appropriate values may be chosen in consideration of the size of the search range (for example, ±16 pixels in the horizontal direction), the nature of the input image, and so on. Motion class 3 exists to avoid inappropriate motion compensation based on a motion vector of low reliability.
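The four-way classification above can be sketched as a small function. The boundary value 6 is the example value from the text and is exposed as a parameter; the function and parameter names are illustrative assumptions.

```python
def motion_class(motion_vector, valid, small_limit=6):
    """Map a horizontal motion vector and its validity flag to motion class 0-3.

    small_limit is the largest absolute value still counted as motion
    class 1 (6 in the example given in the description).
    """
    if not valid:
        return 3          # unreliable vector: treated as its own class
    if motion_vector == 0:
        return 0          # valid and stationary
    if abs(motion_vector) <= small_limit:
        return 1          # valid, small motion
    return 2              # valid, large motion
```

Checking validity first keeps an invalidated vector (which is output as 0) from being mistaken for the stationary case of motion class 0.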
[0025]
Next, the operation of the class tap selection unit 14 and the prediction tap selection unit 15 will be described. The class tap selection unit 14 and the prediction tap selection unit 15 extract pixels at predetermined positions as class taps and prediction taps according to the mode and the motion class. Examples of the tap structures corresponding to modes 0, 1, 2, and 3 are shown in FIGS. 7, 8, 9, and 10, respectively. In FIGS. 7 to 10, pixels extracted as class taps or prediction taps for motion classes 0 and 1 are indicated by black circles, and pixels extracted as class taps or prediction taps for motion classes 2 and 3 are indicated by white circles. All other pixels are indicated by dotted circles.
[0026]
An example of the tap structure in mode 0 is shown in FIG. 7. As shown in FIG. 7A, the class tap structure is the same for motion classes 0 and 1 as for motion classes 2 and 3. That is, in any motion class, a total of seven pixels, five from the current field and two from the field one field after the current field, are extracted as class taps. Further, as shown in FIG. 7B, the prediction tap structure is also the same for motion classes 0 and 1 as for motion classes 2 and 3. That is, a total of 13 pixels are extracted as prediction taps: nine from the current field and two from each of the fields one field before and one field after the current field.
[0027]
An example of the tap structure in mode 1 is shown in FIG. 8. In the case of motion classes 0 and 1, the pixels at the positions indicated by black circles in FIG. 8A (a total of eight pixels: four from the current field, one from the field one field before the current field, and three from the field one field after the current field) are extracted as class taps. In the case of motion classes 2 and 3, the pixels at the positions indicated by white circles in FIG. 8A (four from the current field and five from the field one field after the current field) are extracted as class taps. On the other hand, in the case of motion classes 0 and 1, the pixels at the positions indicated by black circles in FIG. 8B (a total of 14 pixels: eight from the current field and three from each of the fields one field before and one field after the current field) are extracted as prediction taps. In the case of motion classes 2 and 3, the pixels at the positions indicated by white circles in FIG. 8B (a total of 13 pixels: eight from the current field and five from the field one field after the current field) are extracted as prediction taps.
[0028]
An example of the tap structure in mode 2 is shown in FIG. 9. As shown in FIG. 9A, the class tap structure is the same for motion classes 0 and 1 as for motion classes 2 and 3. That is, in any motion class, a total of seven pixels, five from the current field and two from the field one field after the current field, are extracted as class taps. On the other hand, in the case of motion classes 0 and 1, the pixels at the positions indicated by black circles in FIG. 9B (a total of 13 pixels: nine from the current field and two from each of the fields one field before and one field after the current field) are extracted as prediction taps. In the case of motion classes 2 and 3, the pixels at the positions indicated by white circles in FIG. 9B (nine from the current field and six from the field one field after the current field) are extracted as prediction taps.
[0029]
An example of the tap structure in mode 3 is shown in FIG. 10. In the case of motion classes 0 and 1, a total of eight pixels at the positions indicated by black circles in FIG. 10A (two from the current field and three from each of the fields one field before and one field after the current field) are extracted as class taps. In the case of motion classes 2 and 3, the pixels at the positions indicated by white circles in FIG. 10A (a total of eight pixels: six from the current field and one from each of the fields one field before and one field after the current field) are extracted as class taps. On the other hand, as shown in FIG. 10B, the prediction tap structure is the same for motion classes 0 and 1 as for motion classes 2 and 3. That is, in any motion class, a total of 14 pixels are extracted as prediction taps: eight from the current field and three from each of the fields one field before and one field after the current field.
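The tap counts described for modes 0 to 3 can be collected in a small lookup table. The following Python sketch is only a summary aid: the names `CLASS_TAPS`, `PREDICTION_TAPS`, and `tap_counts` and the grouping labels "low" (motion classes 0 and 1) and "high" (motion classes 2 and 3) are hypothetical, and only the per-field pixel counts are taken from the text. The actual pixel positions are given in FIGS. 7 to 10 and are not modeled here.

```python
# Tap counts per field as (previous field, current field, next field).
# Keys are (mode, group), where "low" = motion classes 0 and 1 and
# "high" = motion classes 2 and 3.
CLASS_TAPS = {
    (0, "low"): (0, 5, 2), (0, "high"): (0, 5, 2),
    (1, "low"): (1, 4, 3), (1, "high"): (0, 4, 5),
    (2, "low"): (0, 5, 2), (2, "high"): (0, 5, 2),
    (3, "low"): (3, 2, 3), (3, "high"): (1, 6, 1),
}

PREDICTION_TAPS = {
    (0, "low"): (2, 9, 2), (0, "high"): (2, 9, 2),
    (1, "low"): (3, 8, 3), (1, "high"): (0, 8, 5),
    (2, "low"): (2, 9, 2), (2, "high"): (0, 9, 6),
    (3, "low"): (3, 8, 3), (3, "high"): (3, 8, 3),
}


def tap_counts(mode, motion_class, kind):
    """Return (previous, current, next) tap counts.

    kind is "class" or "prediction"; motion_class is 0-3.
    """
    group = "low" if motion_class in (0, 1) else "high"
    table = CLASS_TAPS if kind == "class" else PREDICTION_TAPS
    return table[(mode, group)]
```

For example, `sum(tap_counts(1, 0, "prediction"))` gives the 14 prediction taps stated for mode 1 with motion classes 0 and 1. The table also makes visible that for large or unreliable motion (motion classes 2 and 3), modes 1 and 2 take no prediction taps from the preceding field.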
[0030]
Next, learning, that is, the calculation of the prediction coefficients used in the mapping described above, will be described. An image having the same signal format as the input image in FIG. 1 is supplied to the motion estimation unit 21 and the motion compensation unit 22 as a student image. The motion estimation unit 21 performs the same processing as the motion estimation unit 11 in FIG. 1. That is, the motion estimation unit 21 estimates the motion vector of the target pixel in the student image and supplies the estimated motion vector to the motion compensation unit 22. In addition, the motion estimation unit 21 supplies the motion vector, together with information indicating its reliability, to the motion class determination unit 23. The motion compensation unit 22, in turn, performs motion compensation processing similar to that of the motion compensation unit 12 in FIG. 1. The image generated as a result of this motion compensation processing is supplied to the class tap selection unit 24 and the prediction tap selection unit 25.
[0031]
The motion class determination unit 23 performs the same processing as the motion class determination unit 13 in FIG. 1 to determine a motion class, and supplies information indicating the determined motion class to the class tap selection unit 24, the prediction tap selection unit 25, and the class determination unit 27. The class tap selection unit 24 extracts pixels at the same positions as the class tap selection unit 14 in FIG. 1 as class taps, and supplies the extracted class tap data to the space class determination unit 26. The space class determination unit 26 determines a space class by performing the same processing as the space class determination unit 16 in FIG. 1 on the supplied data, and supplies information indicating the determined space class to the class determination unit 27.
[0032]
The class determination unit 27 performs the same processing as the class determination unit 17 in FIG. 1 to determine a final class, and supplies information indicating the final class to the matrix selection unit 28. The matrix selection unit 28 selects a matrix corresponding to the final class, and supplies data related to the selected matrix to the matrix addition unit 29.
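How the motion class and the space class are combined into the final class is not detailed in this passage. One simple possibility, shown here purely as an assumption, is to pack the two indices into a single integer so that every (space class, motion class) pair selects its own matrix during learning and its own coefficient set during conversion. The function name `combine_classes` and the packing order are hypothetical.

```python
NUM_MOTION_CLASSES = 4  # motion classes 0-3 as defined in the description


def combine_classes(space_class, motion_class,
                    num_motion_classes=NUM_MOTION_CLASSES):
    """Pack a space class index and a motion class index into one final class.

    This packing is an illustrative assumption, not the patent's stated method.
    """
    if space_class < 0:
        raise ValueError("space_class must be non-negative")
    if not 0 <= motion_class < num_motion_classes:
        raise ValueError("motion_class out of range")
    return space_class * num_motion_classes + motion_class
```

Under this packing, distinct pairs always map to distinct final classes, which is the only property the matrix selection and coefficient lookup rely on.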
[0033]
On the other hand, the prediction tap selection unit 25 extracts pixels at the same positions as the prediction tap selection unit 15 in FIG. 1 as prediction taps, and supplies the extracted prediction tap data to the matrix addition unit 29. The matrix addition unit 29 is further supplied with an image having the same signal format as the output image in FIG. 1 as a teacher image. The matrix addition unit 29 generates the data of a normal equation by adding, to the data supplied from the matrix selection unit 28, the results of calculations based on the prediction tap data and the teacher image. The data of the normal equation is supplied from the matrix addition unit 29 to the coefficient determination unit 30. The coefficient determination unit 30 calculates the prediction coefficients by solving the normal equation. The calculated prediction coefficients are temporarily stored in a memory (not shown), for example, and can then be used in the arithmetic processing described above with reference to FIG. 1 by, for example, loading them into the memory in the prediction coefficient selection unit in FIG. 1.
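The roles of the matrix selection unit 28 and the matrix addition unit 29 can be sketched in Python as a per-class accumulator. This is a minimal sketch under assumptions: the names `NormalEquation` and `accumulate` are hypothetical, the accumulated quantities are taken to be the sums of products of prediction taps and the sums of products of prediction taps with the teacher pixel (the parameters of the normal equation), and each training sample is assumed to arrive as a tuple of final class, prediction taps, and teacher pixel value.

```python
class NormalEquation:
    """Running sums for one class: X[i][j] = sum(x_i * x_j), Y[i] = sum(x_i * y)."""

    def __init__(self, n):
        self.n = n
        self.X = [[0.0] * n for _ in range(n)]
        self.Y = [0.0] * n

    def add(self, taps, teacher):
        """Add one (prediction taps, teacher pixel) pair to the sums."""
        if len(taps) != self.n:
            raise ValueError("unexpected number of prediction taps")
        for i in range(self.n):
            for j in range(self.n):
                self.X[i][j] += taps[i] * taps[j]
            self.Y[i] += taps[i] * teacher


def accumulate(samples):
    """samples: iterable of (final_class, prediction_taps, teacher_pixel).

    Returns a dict mapping each final class to its NormalEquation.
    Looking up the entry by class plays the role of matrix selection.
    """
    equations = {}
    for final_class, taps, teacher in samples:
        if final_class not in equations:
            equations[final_class] = NormalEquation(len(taps))
        equations[final_class].add(taps, teacher)
    return equations
```

Keeping one accumulator per final class means that samples with different motion behavior or different local level patterns never contribute to the same coefficient set.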
[0034]
Next, the calculation of the prediction coefficients will be described. In equation (1) above, the prediction coefficients $w_1, \ldots, w_n$ are undetermined before learning. Learning is performed by inputting a plurality of teacher images for each class. When the number of types of teacher images is denoted by $m$, the following equation (2) is set up from equation (1).
[0035]
$y_k = w_1 x_{k1} + w_2 x_{k2} + \cdots + w_n x_{kn}$ (2)
($k = 1, 2, \ldots, m$)
When $m > n$, the prediction coefficients $w_1, \ldots, w_n$ are not uniquely determined. Therefore, the element $e_k$ of the error vector $e$ is defined by the following equation (3), and the prediction coefficients are determined so as to minimize the error vector $e$ defined by equation (4). That is, the prediction coefficients are uniquely determined by the so-called least squares method.
[0036]
$e_k = y_k - \{w_1 x_{k1} + w_2 x_{k2} + \cdots + w_n x_{kn}\}$ (3)
($k = 1, 2, \ldots, m$)
[0037]
[Equation 1]
$e^2 = \sum_{k=1}^{m} e_k^2$ (4)
[0038]
As a practical method of obtaining the prediction coefficients that minimize $e^2$ in equation (4), $e^2$ is partially differentiated with respect to each prediction coefficient $w_i$ ($i = 1, 2, \ldots$) (equation (5)), and each prediction coefficient $w_i$ is determined so that the partial derivative becomes 0 for every value of $i$.
[0039]
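[Equation 2]
$\frac{\partial e^2}{\partial w_i} = \sum_{k=1}^{m} 2 \left( \frac{\partial e_k}{\partial w_i} \right) e_k = -\sum_{k=1}^{m} 2\, x_{ki}\, e_k$ (5)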
[0040]
A specific procedure for determining each prediction coefficient $w_i$ from equation (5) will now be described. If $X_{ji}$ and $Y_i$ are defined as in equations (6) and (7), equation (5) can be written in the matrix form of equation (8).
[0041]
[Equation 3]
$X_{ji} = \sum_{k=1}^{m} x_{ki}\, x_{kj}$ (6)
[0042]
[Equation 4]
$Y_i = \sum_{k=1}^{m} x_{ki}\, y_k$ (7)
[0043]
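[Equation 5]
$\begin{pmatrix} X_{11} & X_{12} & \cdots & X_{1n} \\ X_{21} & X_{22} & \cdots & X_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ X_{n1} & X_{n2} & \cdots & X_{nn} \end{pmatrix} \begin{pmatrix} w_1 \\ w_2 \\ \vdots \\ w_n \end{pmatrix} = \begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{pmatrix}$ (8)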
[0044]
Equation (8) is generally called a normal equation. The matrix addition unit 29 calculates the parameters of the normal equation (8). Based on the calculated parameters, the coefficient determination unit 30 solves the normal equation (8) by a general matrix solution method such as the sweep-out method, thereby calculating the prediction coefficients $w_i$ ($i = 1, 2, \ldots, n$).
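The sweep-out method mentioned above is commonly understood as Gauss-Jordan elimination. The following Python sketch shows one way the coefficient determination step could be carried out for a single class. The function name `solve_sweep_out`, the use of partial pivoting, and the tolerance `eps` are choices made for this illustration and are not specified in the text.

```python
def solve_sweep_out(X, Y, eps=1e-12):
    """Solve X w = Y by Gauss-Jordan elimination with partial pivoting.

    X -- n x n list of lists (left side of the normal equation)
    Y -- list of length n (right side of the normal equation)
    Returns the list of prediction coefficients w.
    """
    n = len(Y)
    # Build the augmented matrix [X | Y] as floats.
    a = [[float(v) for v in X[i]] + [float(Y[i])] for i in range(n)]
    for col in range(n):
        # Choose the row with the largest pivot to limit rounding error.
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < eps:
            raise ValueError("normal equation is singular for this class")
        a[col], a[pivot] = a[pivot], a[col]
        p = a[col][col]
        a[col] = [v / p for v in a[col]]
        # Sweep the pivot column out of every other row.
        for r in range(n):
            if r != col and a[r][col] != 0.0:
                f = a[r][col]
                a[r] = [rv - f * cv for rv, cv in zip(a[r], a[col])]
    return [a[i][n] for i in range(n)]
```

As a small check, the two training pairs (taps (1, 2), teacher 5) and (taps (3, 4), teacher 11) give X = [[10, 14], [14, 20]] and Y = [38, 54], and solving yields coefficients close to (1, 2), which reproduce both teacher values. A class that receives too few or linearly dependent samples produces a singular matrix, which this sketch reports as an error.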
[0045]
In the embodiment of the present invention described above, the motion vector obtained by the motion estimation unit 11 is reflected not only in motion compensation but also in class tap selection, prediction tap selection, and the class classification performed by the class determination unit 17. However, the way in which the motion vector is reflected in the field frequency conversion process is not limited to this. For example, a configuration that uses the motion vector only for motion compensation, a configuration that uses it for motion compensation and for the selection of class taps and/or prediction taps, or a configuration that uses it for motion compensation and class classification can also improve, to some extent, the performance of field frequency conversion when the motion of the image is large.
[0046]
In general, the more components whose operation reflects the motion vector, the better the conversion performance is expected to be when the motion of the input image is large, but the larger the circuit scale becomes. Therefore, when applying the present invention, a configuration suited to conditions such as the conversion performance, circuit scale, and cost required of the apparatus may be chosen.
[0047]
The present invention is not limited to the above-described embodiment of the present invention, and various modifications and applications can be made without departing from the gist of the present invention.
[0048]
[Effects of the invention]
According to the present invention, field frequency conversion is performed by applying the class classification adaptive processing to an image that has been motion compensated based on the amount of motion in the input image. For this reason, even when the amount of motion is large, the accuracy of the conversion process can be improved.
[0049]
In particular, the accuracy of the conversion process can be further improved by performing the process according to the amount of motion when extracting the image data used for the calculation in the class classification adaptive process.
[0050]
Furthermore, by forming a class that combines a space class based on the level distribution of the image data around the pixel of interest with a motion class based on the amount of motion, the accuracy of the conversion process can be further improved.
[Brief description of the drawings]
FIG. 1 is a block diagram for illustrating a configuration related to a field frequency conversion process in an embodiment of the present invention.
FIG. 2 is a schematic diagram for explaining block matching.
FIG. 3 is a schematic diagram for explaining a positional relationship between a pixel in an input image and a pixel in an output image.
FIG. 4 is a schematic diagram for explaining modes.
FIG. 5 is a schematic diagram for explaining motion compensation processing in mode 0 and mode 3.
FIG. 6 is a schematic diagram for explaining motion compensation processing in mode 1 and mode 2.
FIG. 7 is a schematic diagram for illustrating a tap structure in mode 0.
FIG. 8 is a schematic diagram for illustrating a tap structure in mode 1.
FIG. 9 is a schematic diagram for illustrating a tap structure in mode 2.
FIG. 10 is a schematic diagram for illustrating a tap structure in mode 3.
FIG. 11 is a block diagram for explaining processing related to learning in an embodiment of the present invention.
[Explanation of symbols]
11 ... motion estimation unit, 12 ... motion compensation unit, 21 ... motion estimation unit, 22 ... motion compensation unit

Claims

A field frequency conversion device that forms an output image having more fields than an input image by generating new fields between the fields of the input image, the device comprising:
Motion amount estimation means for detecting a motion vector of a target pixel to be processed in the input image;
Motion compensation means for performing motion compensation processing on the input image according to the motion vector;
Motion class determination means for determining a motion class according to the magnitude of the value of the motion vector;
First image cutout means for cutting out a plurality of image data around the target pixel from the output of the motion compensation means;
Space class determination means for detecting a level distribution of the plurality of image data cut out by the first image cutout means and determining a space class based on the detected level distribution;
Class determination means for determining a class by combining the motion class from the motion class determination means and the space class from the space class determination means;
Second image cutout means for cutting out a plurality of image data around the target pixel from the output of the motion compensation means;
Coefficient storage means for storing prediction coefficients that are determined in advance for each of the classes and are used to estimate an output image signal, and for outputting, from among the stored prediction coefficients, those corresponding to the class from the class determination means; and
Arithmetic processing means for performing a product-sum operation on the plurality of image data cut out by the second image cutout means and the prediction coefficients supplied from the coefficient storage means, thereby generating a predicted value of the image data of the output image,
Wherein the prediction coefficients are obtained in advance by learning for each class, so as to minimize the error between the value generated when a pixel of the output image signal is generated by the product-sum operation and the true value of that pixel, and are stored in the coefficient storage means.

The field frequency conversion device according to claim 1,
Wherein the first image cutout means sets the plurality of image data to be cut out according to the motion class.

The field frequency conversion device according to claim 1,
Wherein the second image cutout means sets the plurality of image data to be cut out according to the motion class.

The field frequency conversion device according to claim 1,
Wherein the motion amount estimation means outputs information on the reliability of the detected motion vector together with the motion vector.

A field frequency conversion method for forming an output image having more fields than an input image by generating new fields between the fields of the input image, the method comprising:
A motion amount estimation step of detecting a motion vector of a target pixel to be processed in the input image;
A motion compensation step of performing motion compensation processing on the input image according to the motion vector;
A motion class determination step of determining a motion class according to the magnitude of the value of the motion vector;
A first image cutout step of cutting out a plurality of image data around the target pixel from the output of the motion compensation step;
A space class determination step of detecting a level distribution of the plurality of image data cut out in the first image cutout step and determining a space class based on the detected level distribution;
A class determination step of determining a class by combining the motion class from the motion class determination step and the space class from the space class determination step;
A second image cutout step of cutting out a plurality of image data around the target pixel from the output of the motion compensation step;
A coefficient storage step of storing prediction coefficients that are determined in advance for each of the classes and are used to estimate an output image signal, and of outputting, from among the stored prediction coefficients, those corresponding to the class from the class determination step; and
An arithmetic processing step of performing a product-sum operation on the plurality of image data cut out in the second image cutout step and the prediction coefficients supplied from the coefficient storage step, thereby generating a predicted value of the image data of the output image,
Wherein the prediction coefficients are obtained in advance by learning for each class, so as to minimize the error between the value generated when a pixel of the output image signal is generated by the product-sum operation and the true value of that pixel, and are stored in the coefficient storage step.