JP3871350B2

JP3871350B2 - Image conversion apparatus and method capable of resolution compensation

Info

Publication number: JP3871350B2
Application number: JP13805594A
Authority: JP
Inventors: 哲二郎近藤
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 1994-05-28
Filing date: 1994-05-28
Publication date: 2007-01-24
Anticipated expiration: 2022-01-24
Also published as: JPH07322215A

Description

【０００１】
【産業上の利用分野】
この発明は、ディジタル画像信号の解像度をより高いものとすることができる解像度補償可能な画像変換装置および方法に関する。
【０００２】
【従来の技術】
従来、標準解像度のビデオ信号（ＳＤ信号）を高解像度のビデオ信号（ＨＤ信号）へ変換（所謂、アップコンバージョン）を行なう場合、補間フィルタによって、水平および垂直方向の画素数が２倍としていた。しかしながら、単に補間によっては、入力信号以上の解像度をつくり出すことができない。
【０００３】
この問題を解決するために、ＨＤ信号の注目画素を周辺のＳＤ画素を使用してクラス分けし、予め学習によって求めておいた、そのクラスの予測係数と複数のＳＤ画素の線形１次結合によって、注目ＨＤ画素の値を形成する解像度補償装置が提案されている。この処理は、それ自身有効であるが、時間領域における処理であるため、これらの領域において、特徴が良く表現できる信号に対して、高精度の処理が可能である。逆の場合には、精度が不充分な問題等が生じる。
【０００４】
時間領域の処理および周波数領域の処理の問題について、一般的に述べると、ディジタル画像信号、ディジタルオーディオ信号等の信号処理を行なう時に、時間領域あるいは周波数領域のいずれかで信号処理を行なうのが普通であった。周波数領域の処理は、信号の定常特性を良く表現できるが、過渡特性の表現には不向きであった。一方、時間領域の処理は、過渡特性を表現するのに適しているが、定常特性を表現するには不向きであった。ここで、定常特性とは、安定した繰り返しの変化を意味し、過渡特性とは、孤立した１回限りの変化を意味する。
【０００５】
一例として、図１３は、時間領域処理の場合を示す。図１３Ａに示すように、過渡特性は、時間軸に対して、その変化が激しい波形（インパルス状の波形）となり、これは、例えば数個のサンプル程度を使用することによって、充分処理することができる。波形中のドットは、サンプリング位置を示し、ディジタル信号の場合は、各サンプリング位置のレベルと対応するサンプリング値を有する離散的信号系列である。但し、図においては、以下も同様であるが、アナログ信号波形でもって表すことにする。一方、定常特性は、時間軸上で図１３Ｂに示すような変化がゆるやかな波形（フラットな波形）となり、これは、数個程度のサンプルを使用しても、波形の特徴が分からず、充分な処理ができない。
【０００６】
次に、周波数領域で考えると、定常特性は、含まれる周波数成分が単一あるいは少ないので、図１４Ａに示すようなインパルス状の波形となる。一方、過渡特性は、図１４Ｂに示すようなフラットな波形となる。上述と同様に、インパルス状の波形の方が信号の特徴をとらえるのに適している。
【０００７】
一般的な信号波形は、時間軸に対しては、図１５に示すように、定常特性（フラット）の部分ＦＬ１、ＦＬ２、ＦＬ３、・・・と過渡特性（インパルス）の部分ＩＭ１、ＩＭ２、・・・とが混在したものである。従って、時間領域処理と周波数領域処理との一方のみを行なうことによっては、信号の特徴を正しく反映した処理を行なうことが難しい。そのために、同一の信号に対して、時間領域処理と周波数領域処理とを行なう必要が生じ、処理時間が長くなったり、処理のためのハードウエアの規模が大きくなる問題があった。
【０００８】
従って、この発明の目的は、解像度補償の処理を行なう時に、ディジタル画像信号の定常特性の部分に対しては、周波数領域で処理し、その過渡特性の部分に対しては、時間領域で処理することができ、精度の向上、処理時間の短縮化、処理のためのハードウエアの規模の減少等が可能な解像度補償可能な画像変換装置および変換方法を提供することにある。
【０００９】
【課題を解決するための手段】
請求項１の発明は、第１のディジタル画像信号を周波数領域において分析する分析手段と、
分析手段の出力に基づいて、第１のディジタル画像信号を分類する分類手段と、
第１の解像度に比べてより高い第２の解像度を有する第２のディジタル画像信号を形成するために、分類手段によって分類されたそれぞれの信号を、第１のディジタル画像信号に基づく特性に応じて、適応的に処理する第１及び第２の処理手段と、
第１及び第２の処理手段の出力を合成する合成手段とからなることを特徴とする解像度補償可能な画像変換装置である。
【００１０】
請求項７の発明は、第１の解像度を有する第１のディジタル画像信号を周波数分析する分析手段と、
分析手段の出力から周波数領域でインパルス状成分の信号とフラット成分の信号とを分離する分離手段と、
分離手段からインパルス状成分の信号が供給され、第１の解像度に比べてより高い第２の解像度を有する第２のディジタル画像信号を形成するために、そのインパルス状成分の信号を周波数領域で処理する第１の処理手段と、
第１の処理手段からの出力を時間領域信号に変換する第１の変換手段と、
分離手段からフラット成分の信号が供給され、フラット成分の信号を時間領域信号に変換する第２の変換手段と、
第２の変換手段から時間領域信号が供給され、第１の解像度に比べてより高い第２の解像度を有する第２のディジタル画像信号を形成するために、時間領域信号を時間領域で処理する第２の処理手段と、
第１の変換手段の出力と第２の処理手段の出力を合成する合成手段とからなることを特徴とする解像度補償可能な画像変換装置である。
【００１１】
解像度を補償する時に、入力ＳＤ信号を周波数領域でインパルス状成分と、フラット成分に分けられる。インパルス状成分は、周波数領域において、解像度補償の処理を行なう処理回路に供給され、フラット成分は、時間領域において、解像度補償の処理を行なう処理回路に供給される。そして、各処理回路で処理された結果の信号が時間領域上で合成され、解像度補償がなされたビデオ信号（ＨＤ信号）が得られる。
【００１２】
【実施例】
以下、この発明によるディジタルビデオ信号の解像度補償装置の一実施例について説明する。解像度補償とは、図２Ａにおいて、もともと２０ａの周波数特性で示すような広帯域のビデオ信号がフィルタリング処理等によって、２０ｂの周波数特性で示すように、帯域が狭くなったことを補償し、すなわち、斜線部分の成分を作り出すことによって、図２Ｂに示す広帯域のビデオ信号へ変換することである。
【００１３】
この一実施例の全体的構成を示す図１において、１で示す入力端子に対して標準解像度のディジタルビデオ信号（ＳＤビデオ信号と称する）が供給される。また、高解像度のディジタルビデオ信号をＨＤビデオ信号と称する。入力ＳＤビデオ信号の例は、ＳＤＶＴＲの再生信号、放送信号等である。入力ＳＤビデオ信号がブロック化回路２に供給され、テレビジョンラスターの順序のビデオ信号が例えば（８×８）のブロック構造の信号に走査変換される。
【００１４】
ブロック化回路２に対して、ＤＣＴ（Discrete Cosine Transform)回路３が接続され、ＤＣＴ回路３からは、一つのブロックと対応して、１個の直流成分の係数データＤＣと６３個の交流成分の係数データＡＣ１、ＡＣ２、・・・、ＡＣ６３とが発生する。一例として、ＤＣから開始して、より高次のＡＣ係数が順次出力されるジグザグ走査でもって、係数データが出力される。ＤＣＴは、入力ビデオ信号の周波数解析の一つの手段であって、ＦＦＴ、アダマール変換等を使用しても良い。
【００１５】
ＤＣＴ回路３からの係数データが係数解析回路４を介して分類回路５に供給される。これらの係数解析回路４および分類回路５は、周波数領域へ変換されたディジタルビデオ信号の定常成分と過渡成分とを分離するために、設けられている。分類回路５からは、周波数領域でのフラットな成分（すなわち、過渡成分）６ａと、インパルス状の成分（すなわち、定常成分）６ｂとが分離して現れる。
【００１６】
理解を容易とするために、係数データの値の一例を（ＤＣ＝５０、ＡＣ１＝４８、ＡＣ２＝４６、ＡＣ３＝４４、ＡＣ４＝４２、ＡＣ５＝６０、・・・・）と仮定する。係数解析回路４は、この係数データの解析を行い、ＡＣ５がインパルス状のものと判断する。つまり、ＡＣ５は、ＡＣ１、ＡＣ２、ＡＣ３、ＡＣ４の変化の傾向から４０となるはずである。それが６０の値となっているので、これは、２０の値、突出している。分類回路５は、周波数領域のフラットな成分（過渡成分であり、上述の例では、ＤＣ＝５０、ＡＣ１＝４８、ＡＣ２＝４６、ＡＣ３＝４４、ＡＣ４＝４２、ＡＣ５＝４０、・・・・）６ａと、周波数領域のインパルス状の成分（定常成分であり、上述の例では、ＤＣ＝０、ＡＣ１＝０、ＡＣ２＝０、ＡＣ３＝０、ＡＣ４＝０、ＡＣ５＝２０、・・・・）６ｂとを分離して出力する。
【００１７】
分類回路５からのフラット成分６ａが逆ＤＣＴ回路７に供給され、時間領域の信号に戻され、ブロック分解回路８に供給される。ブロック分解回路８からは、テレビジョンのラスター走査の順に戻されたディジタルビデオ信号が得られる。このディジタルビデオ信号が第２の処理回路としてのクラス分類適応処理回路９に供給される。この回路９は、後述のように、時間領域において解像度を高くするための処理回路である。フラット成分６ａは、時間領域の処理に適しており、回路９によって、解像度の補償を良好になしうる。
【００１８】
分類回路５からのインパルス状成分６ｂがゲイン変換回路１０に供給される。ゲイン変換回路１０に対しては、ブロック化回路２の出力信号がクラス分類のために供給される。ゲイン変換回路１０には、後述のように学習によって予め獲得されたゲイン変換比情報が格納されたメモリが設けられている。このように、係数データのゲインを変換比情報に従って調整することによって、周波数領域で高域成分が増強される。ゲイン変換回路１０の出力信号が逆ＤＣＴ回路１１に供給される。逆ＤＣＴ回路１１によって、時間領域に戻された信号がブロック分解回路１２に供給され、テレビジョンラスター走査の順のデータへ変換される。
【００１９】
ブロック分解回路１２の出力信号が位相補償回路１３を介して合成回路１４に供給され、合成回路１４にて、上述のクラス分類適応処理回路９の出力信号と合成される。この合成は、単純多重の処理である。そして、合成回路１４から出力端子１５には、解像度が補償されたディジタルビデオ信号、すなわち、ＨＤビデオ信号が得られる。
【００２０】
クラス分類適応処理回路９の一例を図３に示す。２１で示す入力端子に対しては、ブロック分解回路８からのディジタルビデオ信号が供給される。このディジタルビデオ信号は、ＳＤビデオ信号のフラット成分（過渡成分）であり、時間領域でインパルス状となる信号である。このディジタルビデオ信号が同時化回路２２に供給される。同時化回路２２の出力データがクラス分類回路２３に供給される。クラス分類回路２３の出力がマッピング表Ｍ１〜Ｍ４がそれぞれ蓄えられたメモリ２４ａ〜２４ｄにアドレス信号として供給される。
【００２１】
図４は、ＳＤ画像およびＨＤ画像の関係を部分的に示す。図４において、○の画素データがＳＤ画像のもので、×の画素データがＨＤ画像のものである。例えば１２個のＳＤ画像の画素データａ〜ｌから４個のＨＤ画像の画素データｙ１〜ｙ４が生成される。メモリ２４ａのマッピング表Ｍ１は、画素データｙ１を発生するためのもので、メモリ２４ｂ、２４ｃ、２４ｄのマッピング表Ｍ２、Ｍ３、Ｍ４は、画素データｙ２、ｙ３、ｙ４をそれぞれ発生するためのものである。
【００２２】
メモリ２４ａ〜２４ｄの読み出し出力がセレクタ２５に供給される。セレクタ２５は、セレクト信号発生回路２６の出力によって制御される。セレクト信号発生回路２６には、ＨＤ画像のサンプルクロックが入力端子２７から供給される。セレクタ２５によって、４個の画素データｙ１〜ｙ４が順番に選択され、これらの画素データが走査変換回路２８に供給される。走査変換回路２８は、ＨＤ画像の画素データをラスター走査の順に出力端子２９に発生する。出力画像の画素数は、入力ＳＤビデオ信号の画素数の４倍である。
【００２３】
メモリ２４ａ〜２４ｄに格納されるマッピング表Ｍ１〜Ｍ４は、予め学習によって生成される。マッピング表Ｍ１〜Ｍ４の生成のための構成の一例を図５に示す。図５中で、３１で示す入力端子にディジタルのＨＤビデオ信号が供給される。このＨＤビデオ信号は、マッピング表の生成を考慮した標準的な信号であることが好ましい。実際には、標準的な画像をＨＤビデオカメラにより撮像することによって、あるいは撮像信号をＨＤＶＴＲに記録することによって、ＨＤビデオ信号を得ることができる。
【００２４】
このＨＤビデオ信号が同時化回路３２に供給される。この同時化回路３２は、図４に示す位置関係を有する画素データａ〜ｌとｙ₁〜ｙ₄とを同時に出力する。画素データａ〜ｌがクラス分類回路３３に供給される。クラス分類回路３３は、階調、パターン等でＨＤ画素データｙ₁〜ｙ₄のクラス分けを行なう。このクラス分類回路３３の出力がマッピング表生成回路３４ａ〜３４ｄに対して共通に供給される。
【００２５】
同時化回路３２からの画素データｙ₁〜ｙ₄がマッピング表生成回路３４ａ〜３４ｄに対して供給される。マッピング表生成回路３４ａ〜３４ｄは、同一の構成を有している。マッピング表としては、２種類可能である。その一つは、ＨＤ画素の値ｙ₁、ｙ₂、ｙ₃またはｙ₄をＳＤ画素の値ａ〜ｌと係数ｗ₁〜ｗ₁₂の線形結合で予測するためのもので、この場合には、クラス毎に係数ｗ₁〜ｗ₁₂が定まる。他のものは、クラス毎に予測される、ＨＤ画素の値そのものである。
【００２６】
図５中のマッピング表作成回路３４ａ〜３４ｄにそれぞれ設けられたメモリには、ＨＤビデオ信号とＳＤビデオ信号との間の相関を示すマッピング表が蓄えられる。言い換えれば、ＳＤビデオ信号の複数のデータが与えられた時に、この複数のデータのクラスと、平均的に対応が取れたＨＤビデオ信号の画素データを出力するマッピング表が形成できる。
【００２７】
クラス分類回路３３は、図３のクラス分類回路２３と同様に、注目画素データをクラス分類し、クラス情報を発生する。クラス分類としては、階調によるクラス分類、パターンによるクラス分類等を使用できる。階調を使用する時には、画素データが８ビットであると、クラスの個数が極めて多くなるので、各画素のビット数をＡＤＲＣ等の高能率符号化で減少させることが好ましい。パターンを使用する時には、４画素で構成される複数のパターン（例えば平坦、右上に値が上昇、右下に値が減少、等）を用意し、同時化回路３２の出力データを複数のパターンのいずれかにクラス分けする。
【００２８】
ＨＤ画素データｙ₁を求めるマッピング表作成回路３４ａを例にとると、クラス分類回路３３からのクラス情報がアドレスとして供給されるメモリが設けられる。トレーニング（学習）時では、原ＨＤビデオ信号を間引き処理することによって、ＳＤビデオ信号を形成する。水平方向の間引き処理（サブサンプリング）および垂直方向の間引き処理（サブライン）がなされる。１フレーム以上のＨＤビデオ信号例えば静止画像が使用される。メモリには、クラス情報と対応する各アドレスに対して、画素データａ〜ｌおよびｙ₁のサンプル値が書込まれる。例えばメモリのアドレスＡＤ０には、（ａ₁₀、ａ₂₀、・・・、ａ_n0）（ｂ₁₀、ｂ₂₀、・・・、ｂ_n0）・・・・（ｌ₁₀、ｌ₂₀、・・・、ｌ_n0）（ｙ₁₀、ｙ₂₀、・・・、ｙ_n0）が蓄えられる。
【００２９】
このように蓄えられた学習データがメモリから読出され、ＳＤ画素の値ａ〜ｌと係数ｗ₁〜ｗ₁₂の線形１次結合で得られるＨＤ画素（ｙ₁に対応する）予測値と真値との誤差を最小とする係数が最小二乗法によって求められる。一つのメモリのアドレスに蓄えられた学習データに注目すると、このアドレスに関しては、下記の連立方程式が成り立つ。
【００３０】
ｙ₁₀＝ｗ₁ａ₁₀＋ｗ₂ｂ₁₀＋ｗ₃ｃ₁₀＋・・・・・・＋ｗ₁₂ｌ₁₀
ｙ₂₀＝ｗ₁ａ₂₀＋ｗ₂ｂ₂₀＋ｗ₃ｃ₂₀＋・・・・・・＋ｗ₁₂ｌ₂₀
ｙ₃₀＝ｗ₁ａ₃₀＋ｗ₂ｂ₃₀＋ｗ₃ｃ₃₀＋・・・・・・＋ｗ₁₂ｌ₃₀
・
・
・
ｙ_n0＝ｗ₁ａ_n0＋ｗ₂ｂ_n0＋ｗ₃ｃ_n0＋・・・・・・＋ｗ₁₂ｌ_n0
【００３１】
ここで、ｙ₁₀〜ｙ_n0、ａ₁₀〜ａ_n0、ｂ₁₀〜ｂ_n0、ｃ₁₀〜ｃ_n0、・・・・、ｌ₁₀〜ｌ_n0が既知であるので、ｙ₁₀〜ｙ_n0（真値）に対する予測値の誤差の二乗を最小とするような係数ｗ₁〜ｗ₁₂を求めることができる。他のクラス（アドレス）についても同様に係数を決定することができる。このように決定された係数がメモリに格納され、マッピング表として使用される。
【００３２】
係数に限らず、クラス毎にＨＤビデオ信号のデータの値をトレーニングによって求め、メモリに格納しても良い。例えば図６は、そのための構成を示す。クラス分類回路３３からのクラス情報がアドレスとして供給されるデータメモリ４０および度数メモリ４１が設けられる。
【００３３】
度数メモリ４１の読出し出力が加算器４２に供給され、＋１され、加算器４２の出力がメモリ４１の同一アドレスに書込まれる。メモリ４０および４１は、初期状態として各アドレスの内容がゼロにクリアされる。
【００３４】
データメモリ４０から読出されたデータが乗算器４３に供給され、度数メモリ４１から読出された度数と乗算される。乗算器４３の出力が加算器４４に供給され、加算器４４にて入力データｙと加算される。加算器４４の出力が割算器４５に除数として供給される。この割算器４５の出力（商）がデータメモリ４０に入力データとされる。
【００３５】
上述の図６の構成において、あるアドレスが最初にアクセスされる時には、メモリ４０および４１の読出し出力が０であるため、データｙ₁₀がそのままメモリ４０に書込まれ、メモリ４１の対応するアドレスの値が１とされる。若し、その後で、このアドレスが再びアクセスされると、加算器４２の出力が２であり、加算器４４の出力が（ｙ₁₀＋ｙ₂₀）である。従って、割算器４５の出力が（ｙ₁₀＋ｙ₂₀）／２であり、これがメモリ４０に書込まれる。さらに、その後で、上述のアドレスがアクセスされると、同様の動作によって、メモリ４０のデータが（ｙ₁₀＋ｙ₂₀＋ｙ₃₀）／３に変更され、度数も３に更新される。
【００３６】
上述の動作を所定期間行なうことによって、メモリ４０には、クラス分類回路３３の出力によってクラスが指定されると、そのときのデータが出力されるようなマッピング表が蓄えられる。言い換えれば、入力ビデオ信号の複数の画素データが与えられた時に、それをクラス分類したものと平均的に対応がとれたデータを出力するマッピング表が形成できる。
【００３７】
クラス分類適応処理回路９についてより詳細に説明すると、クラス分類適応処理回路９は、上述のように、線形１次結合の係数をトレーニングによって、予め決定する。このトレーニング時には、図７の構成が使用される。図７において、５１は、入力端子で、標準的なＨＤ信号の静止画像を多数枚入力され、垂直間引きフィルタ５２と学習部５４へ供給される。垂直間引きフィルタ５２は、ＨＤ画像を垂直方向に１／２に間引きし、垂直間引きフィルタ５２と接続される水平間引きフィルタ５３で水平方向に１／２に間引きを行ない、ＳＤ信号と同等の画素の静止画像を学習部５４に供給する。メモリ５５は、学習部５４で作成されたクラスコードと学習結果を記憶する。
【００３８】
この例では、図８に示すように、ＨＤ画素とＳＤ画素の位置関係が規定される。図８に示すように、ＳＤ画素（３×３）ブロックを用いる場合、ＳＤ画素ａ〜ｉとＨＤ画素Ａ，Ｂ，Ｃ，Ｄが一組の学習データとなる。１フレームに関して複数組の学習データが存在し、且つ、フレーム数を増加させることにより非常に多数の組の学習データを利用できる。
【００３９】
ここで図９は、学習部５４において、線形１次結合の係数を決定する場合に、その処理をソフトウェアで行なう時の動作を示すフローチャートである。ステップ６１から学習部の制御が開始され、ステップ６２の対応データブロック化では、ＨＤ信号とＳＤ信号が供給され、図８に示すような配列関係にあるＨＤ画素およびＳＤ画素を取り出す処理を行なう。ステップ６３のデータ終了では、入力された全データ例えば１フレームのデータの処理が終了していれば、ステップ６６の予測係数決定へ、終了していなければ、ステップ６４のクラス決定へ制御が移る。
【００４０】
ステップ６４のクラス決定では、ＳＤ信号の信号パターンからクラスを決める。この制御では、ビット数削減のために、ＡＤＲＣを用いることができる。ステップ６５の正規方程式加算では、後述するような方程式を作成する。
【００４１】
ステップ６３のデータ終了から全データの処理が終了後、制御がステップ６６に移り、ステップ６６の予測係数決定では、後述する方程式を行列解法を用いて解いて、予測係数を決める。ステップ６７の予測係数ストアで、予測係数をメモリにストアし、ステップ６８で学習部の制御が終了する。メモリ内には、ＳＤ信号で決定されるクラスをアドレスとして、そのクラスの予測係数が記憶される。クラスおよび予測係数が上述したマッピング表と対応する。
【００４２】
図８中のＨＤ画素とＳＤ画素の関係を規定するための係数を求める処理をより詳細に説明する。一般的にＳＤ画素レベルをｘ₁〜ｘ_nとし、ＨＤ画素レベルをｙとしたとき、クラス毎に係数ｗ₁〜ｗ_nによるｎタップの線形推定式
ｙ´＝ｗ₁ｘ₁＋ｗ₂ｘ₂＋‥‥＋ｗ_nｘ_n （１）
を設定する。学習前はｗ_iが未定係数である。
【００４３】
上述のように、学習はクラス毎に複数のＨＤデータおよびＳＤデータに対して行なう。データ数がｍの場合、式１に従って、
ｙ_j´＝ｗ₁ｘ_j1＋ｗ₂ｘ_j2＋‥‥＋ｗ_nｘ_jn （２）
（但し、ｊ＝１，２，‥‥ｍ）
【００４４】
ｍ＞ｎの場合、ｗ₁〜ｗ_nは一意には決まらないので、誤差ベクトルｅの要素を
ｅ_j＝ｙ_j−（ｗ₁ｘ_j1＋ｗ₂ｘ_j2＋‥‥＋ｗ_nｘ_jn）（３）
（但し、ｊ＝１，２，‥‥ｍ）
と定義して、次の式４を最小にする係数を求める。
【００４５】
【数１】

【００４６】
いわゆる最小自乗法による解法である。ここで式３のｗ_iによる偏微分係数を求める。
【００４７】
【数２】

【００４８】
式６を０にするように各ｗ_iを決めればよいから、
【００４９】
【数３】

【００５０】
として、行列を用いると
【００５１】
【数４】

【００５２】
となり、掃き出し法等の一般的な行列解法を用いて、この式８を解けば予測係数ｗ_iが求まり、クラスコードをアドレスとして、この予測係数ｗ_iをメモリに格納しておく。
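The equation images 【数１】 to 【数４】 did not survive in this text. As a hedged sketch only, the standard least-squares derivation that follows from equations (1) to (3) is written out below. The symbols $E$, $X_{ik}$ and $Y_i$ and the grouping into steps are assumptions for illustration, not a reproduction of the original images or their equation numbers.

Sum of squared errors to be minimized:

$$
E = \sum_{j=1}^{m} e_j^{2}
$$

Partial derivative with respect to each coefficient, using $\partial e_j / \partial w_i = -x_{ji}$ from equation (3):

$$
\frac{\partial E}{\partial w_i} = \sum_{j=1}^{m} 2\,\frac{\partial e_j}{\partial w_i}\,e_j = -2\sum_{j=1}^{m} x_{ji}\,e_j , \qquad i = 1, \dots, n
$$

Setting each derivative to zero and substituting equation (3) for $e_j$ gives the normal equations:

$$
\sum_{k=1}^{n} X_{ik}\,w_k = Y_i , \qquad X_{ik} = \sum_{j=1}^{m} x_{ji}\,x_{jk} , \qquad Y_i = \sum_{j=1}^{m} x_{ji}\,y_j
$$

In matrix form:

$$
\begin{pmatrix}
X_{11} & X_{12} & \cdots & X_{1n} \\
X_{21} & X_{22} & \cdots & X_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
X_{n1} & X_{n2} & \cdots & X_{nn}
\end{pmatrix}
\begin{pmatrix} w_1 \\ w_2 \\ \vdots \\ w_n \end{pmatrix}
=
\begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{pmatrix}
$$

This $n \times n$ system is the kind of equation that a general matrix method such as sweep-out (Gauss-Jordan) elimination solves for the prediction coefficients $w_i$.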
【００５３】
以上のように学習部が実データであるＨＤ信号を用いて予測係数ｗ_iを獲得することができ、これをメモリに格納しておく。そして、任意の入力されたＳＤ信号からクラス情報を形成し、クラス情報と対応する予測係数をメモリから読出し、注目画素の周辺のＳＤ画素の値と予測係数の線形１次結合によって、注目画素の値を形成することができ、任意の入力ＳＤ画像に対して出力ＨＤ画像を生成することができる。
【００５４】
学習部５４が予測係数ではなく、クラス毎の代表値を決定する時には、図１０のフローチャートで示すような処理がなされる。開始のステップ７１、学習データ形成のステップ７２およびデータ終了のステップ７３およびクラス決定のステップ７４は、上述した図９中のステップ６１、６２、６３および６４と同様のものである。
【００５５】
正規化のステップ７５では、画素の値の正規化がなされる。すなわち、ＨＤ画素の値（入力値）をｙとすると、（ｙ−base）／ＤＲの演算により入力データが正規化される。ここで、ＤＲは、図８に示す画素配列において、ａ〜ｉを１ブロックとする時に、この１ブロック内の画素の最大値と最小値の差（ダイナミックレンジＤＲ）である。また、baseは、ブロックの基準値であり、例えばブロックの画素の最小値である。最小値以外にブロック内の画素値の平均値を使用しても良い。この正規化によって、画素の相対的レベルに注目することができる。
【００５６】
代表値決定のステップ７６では、図６の場合と同様にしてそのクラスの累積度数n(c)を求め、また、代表値g(c)を求める。すなわち、新たに形成される代表値g(c)´は、
g(c)´＝｛（ｙ−base）／ＤＲ＋n(c)×g(c)｝／｛n(c)＋１｝ （９）
である。このように求められたクラス毎の代表値がメモリに格納される。
【００５７】
また、クラス分けのための情報圧縮手段としては、ＡＤＲＣ回路の代わりに例えば、ＤＣＴ（Discrete Cosine Transform ）、ＶＱ（ベクトル量子化）、あるいはＤＰＣＭ（予測符号化）回路を設ける等のように、データ圧縮を行なえることができる手段であれば何を設けるかは適宜選択可能である。
【００５８】
上述したように、クラス分類適応処理回路９は、時間領域において、実際の画像の性質に基づいてＳＤ信号およびＨＤ信号の対応関係を学習し、その学習からＳＤ信号に対応するＨＤ信号を生成することができる。また、ＳＤ信号のレベル分布に応じて適応的にクラスを選択するため、画像の局所的性質に追従したアップコンバージョンが可能となる。さらに、補間フィルタを用いたものと異なり、解像度の補償されたＨＤ信号を得ることができる。
【００５９】
さて、図１に戻ると、分類回路５からの周波数領域でインパルス状の成分６ｂが供給される、第１の処理回路としてのゲイン変換回路１０は、周波数領域で解像度を補償するものである。すなわち、ゲイン変換は、図１１に示すように、もともとは、高域まで周波数特性が拡大していた信号の高域のゲインが信号処理によって低下することを補償するものである。ゲイン変換回路１０は、クラス分類適応処理回路９と同様に、予め学習によって、高域を補償するためのマッピング表が格納されたメモリを有している。このマッピング表としては、上述した時間領域のクラス分類適応処理回路９と同様に、ゲイン変換比を出力するものと、ゲインの予測値を出力するものとの２種類可能である。
【００６０】
図１２は、ゲイン変換回路１０内のマッピング表を作成するための学習時の構成を示す。８１で示す入力端子に、学習に使用するＨＤビデオデータが供給され、サブライン／サブサンプル回路８２に供給される。この回路８２は、垂直方向の間引き（サブライン）と水平方向の間引き（サブサンプル）とを行なう。従って、サブライン／サブサンプル回路８２からは、ＳＤビデオ信号と同程度の解像度を有するビデオ信号が発生する。
【００６１】
サブライン／サブサンプル回路８２に対して遅延回路８３およびＤ／Ａ変換器９０が接続される。遅延回路８３は、クラス分類がなされるまで、入力データを遅延させ、タイミングを合わせるためのものである。遅延回路８３に対してブロック化回路８４が接続され、例えば（４×４）のブロック構造のデータが同時化される。ブロック化回路８４の出力がＤＣＴ回路８５に供給され、コサイン変換がされる。ＤＣＴ回路８５からは、直流成分の係数データから開始して、交流分の係数データが低次から高次のものの順番（ジグザク走査）で係数データが発生する。
【００６２】
ＤＣＴ回路８５からの係数データが割算回路８６に供給される。この割算回路８６は、高域を補償するために必要とされる、係数データに対するゲイン変換比を求めるために設けられている。割算回路８６からのゲイン変換比信号がメモリ８７に供給される。メモリ８７は、複数のＤＣＴ係数とそれぞれ対応してゲイン変換比を記憶するために、複数枚の構成とされている。
【００６３】
信号処理の結果生じる、ＳＤビデオ信号の高域の劣化を調べるために、Ｄ／Ａ変換器９０によりアナログ信号とされたＳＤビデオ信号がアナログ伝送系９１に供給される。アナログ伝送系９１は、例えばアナログＶＴＲの記録および再生プロセスである。アナログ伝送系９１を介されたビデオ信号がＡ／Ｄ変換器９２によってディジタル信号とされ、ブロック化回路９３に供給される。
【００６４】
ブロック化回路９３によって、ブロック化回路８４の出力データと同様のブロック構造のディジタルビデオデータが形成される。ブロック化回路９３の出力データがＤＣＴ回路９４およびクラス分類回路９５に供給される。ＤＣＴ回路９４からの係数データが割算回路８６に対して供給される。同じ次数の係数データに関して、割算処理がなされ、係数データに関するゲイン変換比信号が割算回路８６で生成される。すなわち、アナログ伝送系９１を通ると、高域周波数成分が失われるが、それによって、ＤＣＴの係数データの各成分のゲイン（値）がどのように変化するかがゲイン変換比信号によって指示される。
【００６５】
例えばＤＣＴ回路８５からＤＣ、ＡＣ１〜ＡＣ１５の係数データが発生し、ＤＣＴ回路９４からＤＣ´、ＡＣ１´〜ＡＣ１５´の係数データが発生する場合を考える。割算回路８６では、下記の演算によってゲイン変換比信号Ｇ₀、Ｇ₁、・・・・、Ｇ₁₅が形成される。
Ｇ₀＝ＤＣ／ＤＣ´、Ｇ₁＝ＡＣ１／ＡＣ１´、・・・、Ｇ₁₅＝ＡＣ１５／ＡＣ１５´
【００６６】
図１２では、簡単のために省略しているが、各係数に関して発生する複数のゲイン変換比信号を平均化することによって、最終的なゲイン変換比信号が求められ、これがメモリ８７に記憶される。
【００６７】
このようなゲイン変換比信号は、高域が減衰したビデオデータの係数データに対して、乗じられることによって、高域が補償されたビデオデータの係数データを生成することを可能とする。図１中のゲイン変換回路１０は、予め学習により得られたゲイン変換比信号が記憶されているメモリを有し、係数データとゲイン変換比信号とを乗じることによって、係数データの値を変更する。これによって、高域の補償を行なうことができる。
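The learning and use of the gain conversion ratios described above (per-coefficient division, averaging over many blocks, then multiplication at conversion time) can be sketched in Python as follows. This is a minimal illustration, not the circuit of FIG. 12: the function names, the `eps` guard against near-zero divisors, and the fallback ratio of 1.0 are assumptions, and the per-class memories are omitted for brevity.

```python
def learn_gain_ratios(original_blocks, degraded_blocks, eps=1e-6):
    """Average G_k = coef_k / coef'_k over pairs of coefficient blocks.

    original_blocks: DCT coefficients before the transmission system.
    degraded_blocks: DCT coefficients of the same blocks after it.
    `eps` and the 1.0 fallback are illustrative assumptions.
    """
    n = len(original_blocks[0])
    sums = [0.0] * n
    counts = [0] * n
    for orig, degr in zip(original_blocks, degraded_blocks):
        for k in range(n):
            if abs(degr[k]) > eps:          # skip ratios with a ~0 divisor
                sums[k] += orig[k] / degr[k]
                counts[k] += 1
    # Coefficients never observed with a usable divisor are left unchanged.
    return [sums[k] / counts[k] if counts[k] else 1.0 for k in range(n)]


def apply_gain(coeffs, ratios):
    """Multiply each coefficient by its learned gain conversion ratio."""
    return [c * g for c, g in zip(coeffs, ratios)]


ratios = learn_gain_ratios(
    [[100.0, 40.0, 20.0], [200.0, 80.0, 10.0]],   # before degradation
    [[100.0, 20.0, 5.0], [200.0, 40.0, 0.0]],     # after degradation
)
restored = apply_gain([50.0, 10.0, 2.0], ratios)
```

In this toy data the DC term is unchanged and the higher-order terms are attenuated, so the learned ratios grow with the coefficient order, which is the high-frequency boost the text describes.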
【００６８】
クラス分類回路９５は、ブロック化回路９３からのブロックデータのレベル分布に応じたクラス分けを行なう。このクラス分けのために、上述したように、ＡＤＲＣ等のデータ圧縮を行なうことが好ましい。クラス分類回路９５で得られたクラス情報がメモリ８７に対して、メモリ内アドレスとして供給される。メモリ８７は、直流分の係数データと、全ての次数の交流分の係数データとのそれぞれと対応して複数枚の構成とされ、複数枚のメモリのそれぞれが対応する係数データに関してゲイン変換比信号を記憶する。
【００６９】
係数データと対応して、複数枚のメモリを切り換えるためのアドレスは、アドレスカウンタ８８により形成される。アドレスカウンタ８８は、入力端子８９からのクロック信号をカウントし、順次変化するアドレスを発生する。この場合、ブロック化回路８４からの係数データと同期してアドレスが変化する。そして、複数の種類のＨＤビデオ信号が入力端子８１に供給され、クラス毎に最適なゲイン変換比信号が形成され、これがメモリ８７に記憶される。
【００７０】
また、ゲイン変換比の代わりに、予測されるＤＣＴ係数の値を学習によって、求めることも可能である。
【００７１】
メモリ８７に格納されたゲイン変換比信号と同一のものが図１のゲイン変換回路１０に設けられたメモリ内に記憶されている。また、ブロック化回路２の出力信号がクラス分類のためにゲイン変換回路１０に供給されている。ゲイン変換回路１０において、ＤＣＴ係数データの各成分とゲイン変換比信号とが乗じられ、ゲイン調整がなされる。これによって、周波数領域の高域の補償がなされる。ここで、ゲイン変換回路１０に対しては、周波数領域でインパルス状成分６ｂが供給されている。その理由は、若し、フラット成分をも含む種々の成分からなる信号を変換しようとすると、非線形成分が混入して精度が悪化し、正しいゲイン変換ができない問題が生じるからである。同様の理由で、上述の図１２に示す学習時においても、インパルス状の信号が使用される。
【００７２】
【発明の効果】
この発明は、単なる補間フィルタによる補間と異なり、高域成分を創造することによって、解像度が入力ビデオ信号のものより高い、出力ビデオ信号を形成することができる。そして、この発明は、入力ビデオ信号を時間領域における表現に適した成分と、周波数領域における表現に適した成分とを分け、各成分を並行して処理し、各領域の処理の結果を合成するので、各領域の処理を２段階に行なうのと比較して、処理時間の短縮化、ハードウエアの規模の減少、精度の向上等の利点を得ることができる。
【図面の簡単な説明】
【図１】この発明の一実施例の全体的なブロック図である。
【図２】この発明の一実施例によりなされる解像度補償を説明するための略線図である。
【図３】この発明の一実施例におけるクラス分類適応処理回路の一例のブロック図である。
【図４】ＳＤ画像とＨＤ画像との間の画素の配列を示す略線図である。
【図５】予測係数が格納されたマッピング表を作成するための構成の一例のブロック図である。
【図６】予測値が格納されたマッピング表を作成するための構成の一例のブロック図である。
【図７】予測係数あるいは予測値を形成するための学習時の構成の一例のブロック図である。
【図８】ＳＤ画像とＨＤ画像との間の画素の配列の他の例を示す略線図である。
【図９】予測係数を形成するための学習時の処理を示すフローチャートである。
【図１０】予測値を形成するための学習時の処理を示すフローチャートである。
【図１１】周波数領域での高域補償を説明するための略線図である。
【図１２】周波数領域での高域補償用のゲイン変換比を学習するためのブロック図である。
【図１３】時間領域におけるインパルス状成分およびフラット成分をそれぞれ示す略線図である。
【図１４】周波数領域におけるインパルス状成分およびフラット成分をそれぞれ示す略線図である。
【図１５】時間領域におけるインパルス状成分およびフラット成分の両者を含む信号波形の略線図である。
【符号の説明】
１ 高解像度のディジタル画像信号の入力端子
３ ＤＣＴ回路
５ 周波数領域でのフラット成分およびインパルス状成分を分離する分類回路
７、１１ 逆ＤＣＴ回路
９ クラス分類適応処理回路
１０ ゲイン変換回路

[0001]
[Industrial application fields]
The present invention relates to a resolution-compensating image conversion apparatus and method capable of increasing the resolution of a digital image signal.
[0002]
[Prior art]
Conventionally, when a standard-resolution video signal (SD signal) is converted into a high-resolution video signal (HD signal), so-called up-conversion, an interpolation filter doubles the number of pixels in the horizontal and vertical directions. Simple interpolation, however, cannot create resolution beyond that of the input signal.
[0003]
To solve this problem, a resolution compensation apparatus has been proposed in which the HD pixel of interest is classified using the surrounding SD pixels, and its value is formed as a linear combination of a plurality of SD pixels and the prediction coefficients of that class, which are obtained in advance by learning. This processing is effective in itself, but because it operates in the time domain, it achieves high accuracy only for signals whose features are well expressed in that domain. In the opposite case, problems such as insufficient accuracy arise.
[0004]
To describe the problems of time-domain and frequency-domain processing in general terms: signal processing of digital image signals, digital audio signals, and the like has usually been performed in either the time domain or the frequency domain. Frequency-domain processing expresses the steady characteristics of a signal well but is unsuited to expressing its transient characteristics. Time-domain processing, on the other hand, is suited to expressing transient characteristics but not steady characteristics. Here, a steady characteristic means a stable, repetitive change, and a transient characteristic means an isolated, one-time change.
[0005]
As an example, FIG. 13 shows the case of time-domain processing. As shown in FIG. 13A, a transient characteristic appears on the time axis as a rapidly changing waveform (an impulse-like waveform), which can be processed adequately using, for example, only a few samples. The dots on the waveform indicate sampling positions; a digital signal is a discrete sequence of sample values corresponding to the level at each sampling position. In this and the following figures, however, signals are drawn as analog waveforms. A steady characteristic, on the other hand, appears on the time axis as a slowly changing waveform (a flat waveform), as shown in FIG. 13B; a few samples do not reveal the features of such a waveform, so it cannot be processed adequately.
[0006]
Considered next in the frequency domain, a steady characteristic contains a single frequency component or only a few, so it appears as an impulse-like waveform as shown in FIG. 14A. A transient characteristic, on the other hand, appears as a flat waveform as shown in FIG. 14B. As in the time-domain case, the impulse-like waveform is the one better suited to capturing the features of the signal.
[0007]
As shown in FIG. 15, a general signal waveform is, on the time axis, a mixture of steady (flat) portions FL1, FL2, FL3, ... and transient (impulse) portions IM1, IM2, .... It is therefore difficult to perform processing that correctly reflects the features of the signal using only one of time-domain processing and frequency-domain processing. This makes it necessary to apply both time-domain processing and frequency-domain processing to the same signal, which lengthens the processing time and enlarges the hardware required.
[0008]
Accordingly, an object of the present invention is to provide an image conversion apparatus and conversion method capable of resolution compensation in which, during resolution compensation, the steady portions of a digital image signal are processed in the frequency domain and the transient portions are processed in the time domain, thereby improving accuracy, shortening processing time, and reducing the scale of the processing hardware.
[0009]
[Means for Solving the Problems]
  The invention of claim 1 is an image conversion apparatus capable of resolution compensation, comprising: analysis means for analyzing a first digital image signal in the frequency domain;
  classification means for classifying the first digital image signal based on the output of the analysis means;
  first and second processing means for adaptively processing each of the signals classified by the classification means, in accordance with characteristics based on the first digital image signal, in order to form a second digital image signal having a second resolution higher than a first resolution; and
  combining means for combining the outputs of the first and second processing means.
[0010]
  The invention of claim 7 is an image conversion apparatus capable of resolution compensation, comprising: analysis means for frequency-analyzing a first digital image signal having a first resolution;
  separation means for separating, in the frequency domain, an impulse-like component signal and a flat component signal from the output of the analysis means;
  first processing means, supplied with the impulse-like component signal from the separation means, for processing the impulse-like component signal in the frequency domain in order to form a second digital image signal having a second resolution higher than the first resolution;
  first conversion means for converting the output of the first processing means into a time-domain signal;
  second conversion means, supplied with the flat component signal from the separation means, for converting the flat component signal into a time-domain signal;
  second processing means, supplied with the time-domain signal from the second conversion means, for processing the time-domain signal in the time domain in order to form the second digital image signal having the second resolution higher than the first resolution; and
  combining means for combining the output of the first conversion means and the output of the second processing means.
[0011]
  When resolution is compensated, the input SD signal is divided in the frequency domain into an impulse-like component and a flat component. The impulse-like component is supplied to a processing circuit that performs resolution compensation in the frequency domain, and the flat component is supplied to a processing circuit that performs resolution compensation in the time domain. The signals produced by the two processing circuits are then combined in the time domain to obtain a resolution-compensated video signal (HD signal).
[0012]
[Embodiment]
An embodiment of a resolution compensation apparatus for digital video signals according to the present invention is described below. Resolution compensation means compensating for the narrowing of bandwidth that occurs when a video signal that originally had the wide-band frequency characteristic shown as 20a in FIG. 2A is reduced, by filtering or similar processing, to the characteristic shown as 20b. That is, the components in the hatched portion are created so that the signal is converted into the wide-band video signal shown in FIG. 2B.
[0013]
In FIG. 1 showing the overall configuration of this embodiment, a standard-definition digital video signal (referred to as an SD video signal) is supplied to an input terminal 1. A high-resolution digital video signal is referred to as an HD video signal. Examples of input SD video signals are SDVTR playback signals, broadcast signals, and the like. The input SD video signal is supplied to the blocking circuit 2, and the video signal in the order of the television raster is scan-converted into a signal having a block structure of (8 × 8), for example.
[0014]
A DCT (Discrete Cosine Transform) circuit 3 is connected to the blocking circuit 2. For each block, the DCT circuit 3 generates one DC coefficient DC and 63 AC coefficients AC1, AC2, ..., AC63. As an example, the coefficient data are output by zigzag scanning, starting from DC and proceeding to successively higher-order AC coefficients. The DCT is one means of frequency analysis of the input video signal; an FFT, a Hadamard transform, or the like may be used instead.
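As a rough software analogue of the blocking and DCT stages, the sketch below transforms one square block and lists the coefficients in zigzag order (DC first, then AC1, AC2, ...). It is an illustration under assumptions: the orthonormal DCT-II scaling and the JPEG-style zigzag path are choices made here, since the text specifies neither, and all function names are hypothetical.

```python
import math


def dct_matrix(n=8):
    """Orthonormal DCT-II basis: row k is the k-th cosine basis vector."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def dct_2d(block):
    """Separable 2-D DCT of a square block, i.e. C * block * C^T."""
    n = len(block)
    c = dct_matrix(n)
    # Transform the columns: tmp[k][j] = sum_i c[k][i] * block[i][j]
    tmp = [[sum(c[k][i] * block[i][j] for i in range(n)) for j in range(n)]
           for k in range(n)]
    # Transform the rows: out[k][l] = sum_j tmp[k][j] * c[l][j]
    return [[sum(tmp[k][j] * c[l][j] for j in range(n)) for l in range(n)]
            for k in range(n)]


def zigzag_order(n=8):
    """(row, col) positions in zigzag order, lowest frequencies first."""
    return sorted(
        ((r, c) for r in range(n) for c in range(n)),
        # Sort by anti-diagonal, alternating the direction on each diagonal.
        key=lambda rc: (rc[0] + rc[1],
                        rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )


def block_to_coefficients(block):
    """Return [DC, AC1, AC2, ...] for one block."""
    coeffs = dct_2d(block)
    return [coeffs[r][c] for r, c in zigzag_order(len(block))]


flat_block = [[10.0] * 8 for _ in range(8)]
coefficients = block_to_coefficients(flat_block)
```

A constant block has energy only in the DC term, so every AC coefficient in `coefficients` is zero up to rounding error.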
[0015]
Coefficient data from the DCT circuit 3 are supplied to the classification circuit 5 via the coefficient analysis circuit 4. The coefficient analysis circuit 4 and the classification circuit 5 are provided to separate the steady component and the transient component of the digital video signal converted into the frequency domain. The classification circuit 5 outputs separately a component that is flat in the frequency domain (that is, the transient component) 6a and an impulse-like component (that is, the steady component) 6b.
[0016]
To facilitate understanding, assume as an example the coefficient values (DC = 50, AC1 = 48, AC2 = 46, AC3 = 44, AC4 = 42, AC5 = 60, ...). The coefficient analysis circuit 4 analyzes these coefficient data and determines that AC5 is impulse-like: from the trend of AC1, AC2, AC3, and AC4, AC5 should be 40, but its value is 60, so it protrudes by 20. The classification circuit 5 separates and outputs the flat component 6a in the frequency domain (the transient component; in this example, DC = 50, AC1 = 48, AC2 = 46, AC3 = 44, AC4 = 42, AC5 = 40, ...) and the impulse-like component 6b in the frequency domain (the steady component; in this example, DC = 0, AC1 = 0, AC2 = 0, AC3 = 0, AC4 = 0, AC5 = 20, ...).
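The numerical example can be reproduced with a small Python sketch. The text does not state how the coefficient analysis circuit 4 estimates the trend, so the rule used here (linear extrapolation from the two preceding flat values, with a fixed `threshold`) is an assumption chosen only because it matches the example; the function name is hypothetical.

```python
def split_flat_impulse(coeffs, threshold=10.0):
    """Split zigzag-ordered coefficients into (flat, impulse) lists.

    Assumption (not from the patent): each coefficient is predicted by
    linearly extrapolating the two preceding flat values, and a deviation
    larger than `threshold` is treated as an impulse.
    """
    flat = list(coeffs[:2])            # the first two values seed the trend
    impulse = [0.0] * len(flat)
    for value in coeffs[2:]:
        predicted = 2 * flat[-1] - flat[-2]   # continue the local trend
        excess = value - predicted
        if abs(excess) > threshold:
            flat.append(predicted)     # the trend stays in the flat part
            impulse.append(excess)     # the protrusion becomes the impulse
        else:
            flat.append(value)
            impulse.append(0.0)
    return flat, impulse


flat, impulse = split_flat_impulse([50, 48, 46, 44, 42, 60])
# flat    -> [50, 48, 46, 44, 42, 40]
# impulse -> [0.0, 0.0, 0.0, 0.0, 0.0, 20]
```

By construction the two outputs always sum back to the input, so combining the separately processed components after the inverse transforms does not lose any part of the original signal.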
[0017]
The flat component 6a from the classification circuit 5 is supplied to the inverse DCT circuit 7, returned to a time-domain signal, and supplied to the block decomposition circuit 8. The block decomposition circuit 8 outputs a digital video signal restored to television raster-scan order. This digital video signal is supplied to the class classification adaptive processing circuit 9, which serves as the second processing circuit. As described later, the circuit 9 increases resolution in the time domain. The flat component 6a is suited to time-domain processing, so the circuit 9 can compensate its resolution well.
[0018]
The impulse-like component 6b from the classification circuit 5 is supplied to the gain conversion circuit 10. The output signal of the blocking circuit 2 is also supplied to the gain conversion circuit 10 for class classification. The gain conversion circuit 10 includes a memory storing gain conversion ratio information acquired in advance by learning, as described later. Adjusting the gain of the coefficient data according to this conversion ratio information enhances the high-frequency components in the frequency domain. The output signal of the gain conversion circuit 10 is supplied to the inverse DCT circuit 11. The signal returned to the time domain by the inverse DCT circuit 11 is supplied to the block decomposition circuit 12 and converted into data in television raster-scan order.
[0019]
The output signal of the block decomposition circuit 12 is supplied via the phase compensation circuit 13 to the synthesis circuit 14, where it is combined with the output signal of the class classification adaptive processing circuit 9. This combination is simple multiplexing. The resolution-compensated digital video signal, that is, the HD video signal, is obtained at the output terminal 15 from the synthesis circuit 14.
[0020]
An example of the class classification adaptive processing circuit 9 is shown in FIG. A digital video signal from the block decomposition circuit 8 is supplied to an input terminal indicated by 21. This digital video signal is a flat component (transient component) of the SD video signal and is an impulse-like signal in the time domain. This digital video signal is supplied to the synchronization circuit 22. Output data of the synchronization circuit 22 is supplied to the class classification circuit 23. The output of the class classification circuit 23 is supplied as an address signal to the memories 24a to 24d in which the mapping tables M1 to M4 are stored.
[0021]
FIG. 4 partially shows the relationship between the SD image and the HD image. In FIG. 4, pixel data marked ○ belong to the SD image and pixel data marked × belong to the HD image. For example, four HD pixel data y1 to y4 are generated from twelve SD pixel data a to l. The mapping table M1 in the memory 24a generates the pixel data y1, and the mapping tables M2, M3, and M4 in the memories 24b, 24c, and 24d generate the pixel data y2, y3, and y4, respectively.
[0022]
The read outputs of the memories 24a to 24d are supplied to the selector 25, which is controlled by the output of the select signal generation circuit 26. The HD image sample clock is supplied to the select signal generation circuit 26 from the input terminal 27. The selector 25 selects the four pixel data y1 to y4 in turn and supplies them to the scan conversion circuit 28, which outputs the HD pixel data at the output terminal 29 in raster-scan order. The output image has four times as many pixels as the input SD video signal.
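The read-out path of FIG. 3 can be pictured with the following Python sketch, which assumes the coefficient type of mapping table: the class code addresses four tables, and each table yields one HD pixel from the same twelve SD pixels. The function name, the dictionary layout of the tables, and the toy coefficients are hypothetical; real tables would come from the learning procedure.

```python
def predict_hd_pixels(sd_taps, class_id, tables):
    """Return [y1, y2, y3, y4] for one group of twelve SD pixels a..l.

    tables: four mapping tables (standing in for M1..M4); each maps a
    class id to twelve prediction coefficients w1..w12.
    """
    return [sum(w * x for w, x in zip(table[class_id], sd_taps))
            for table in tables]


def one_hot(index, size=12):
    """Toy coefficient set that simply copies one SD pixel."""
    return [1.0 if i == index else 0.0 for i in range(size)]


# Hypothetical tables with a single class 0; each copies a different tap.
toy_tables = [{0: one_hot(5)}, {0: one_hot(6)}, {0: one_hot(9)}, {0: one_hot(10)}]
sd_pixels = [float(v) for v in range(100, 112)]        # a..l
hd_pixels = predict_hd_pixels(sd_pixels, 0, toy_tables)
```

Four outputs per group of SD pixels is what gives the output image four times the pixel count of the input.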
[0023]
The mapping tables M1 to M4 stored in the memories 24a to 24d are generated in advance by learning. An example of a configuration for generating the mapping tables M1 to M4 is shown in FIG. In FIG. 5, a digital HD video signal is supplied to an input terminal 31. This HD video signal is preferably a standard signal considering the generation of the mapping table. In practice, an HD video signal can be obtained by capturing a standard image with an HD video camera or by recording an image signal on an HDVTR.
[0024]
This HD video signal is supplied to the synchronization circuit 32. The synchronization circuit 32 simultaneously outputs the pixel data a to l and y1 to y4 having the positional relationship shown in FIG. 4. The pixel data a to l are supplied to the class classification circuit 33, which classifies the HD pixel data y1 to y4 by gradation, pattern, or the like. The output of the class classification circuit 33 is supplied in common to the mapping table generation circuits 34a to 34d.
[0025]
The pixel data y1 to y4 from the synchronization circuit 32 are supplied to the mapping table generation circuits 34a to 34d. The mapping table generation circuits 34a to 34d have the same configuration. Two types of mapping table are possible. One predicts the HD pixel value y1, y2, y3, or y4 as a linear combination of the SD pixel values a to l and coefficients w1 to w12; in this case, the coefficients w1 to w12 are determined for each class. The other holds the predicted HD pixel value itself for each class.
[0026]
The memories provided in the mapping table creation circuits 34a to 34d in FIG. 5 store mapping tables that express the correlation between the HD video signal and the SD video signal. In other words, a mapping table can be formed that, given a plurality of SD video data, outputs the HD pixel data that corresponds on average to the class of those data.
[0027]
Similar to the class classification circuit 23 of FIG. 3, the class classification circuit 33 classifies the target pixel data and generates class information. As class classification, class classification by gradation, class classification by pattern, or the like can be used. When using gradation, if the pixel data is 8 bits, the number of classes becomes extremely large. Therefore, it is preferable to reduce the number of bits of each pixel by high-efficiency encoding such as ADRC. When using a pattern, prepare a plurality of patterns composed of four pixels (for example, flat, the value increases at the upper right, the value decreases at the lower right, etc.), and the output data of the synchronization circuit 32 is converted into a plurality of patterns. Classify into one.
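To show why reducing the bits per pixel keeps the number of classes manageable, the sketch below builds a class code from requantized pixel values. The text names ADRC without giving its formula, so the requantization rule here (level = (pixel - min) * 2^bits / (max - min + 1), truncated) is an assumption based on the usual description of adaptive dynamic range coding, and the function name is hypothetical.

```python
def adrc_class_code(pixels, bits=1):
    """Pack requantized integer pixels into one integer class code.

    Assumed rule: each pixel is requantized to `bits` bits relative to
    the minimum and dynamic range of its own block.
    """
    lo, hi = min(pixels), max(pixels)
    dynamic_range = hi - lo + 1            # +1 keeps the top level in range
    levels = 1 << bits
    code = 0
    for p in pixels:
        q = (p - lo) * levels // dynamic_range     # 0 .. levels - 1
        code = (code << bits) | q
    return code


edge = adrc_class_code([10, 200, 10, 200])         # alternating pattern
flat = adrc_class_code([7, 7, 7, 7])               # no variation
ramp = adrc_class_code(list(range(0, 120, 10)))    # twelve rising pixels
```

With twelve pixels at 1 bit each there are 2^12 = 4096 possible class codes, compared with 2^96 if the raw 8-bit values were used directly as the class.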
[0028]
Taking the mapping table creation circuit 34a, which obtains the HD pixel data y1, as an example, a memory is provided to which the class information from the class classification circuit 33 is supplied as an address. During training (learning), the SD video signal is formed by thinning out the original HD video signal; thinning is performed in the horizontal direction (subsampling) and in the vertical direction (sublining). An HD video signal of one frame or more, for example a still image, is used. The sample values of the pixel data a to l and y1 are written to the memory at the address corresponding to the class information. For example, the memory address AD0 stores (a_10, a_20, ..., a_n0), (b_10, b_20, ..., b_n0), ..., (l_10, l_20, ..., l_n0), and (y_10, y_20, ..., y_n0).
[0029]
The learning data stored in this way is read from the memory, and the coefficients $w_1$ to $w_{12}$ that minimize the error between the predicted value of the HD pixel ($y_1$), formed from the SD pixel values a to l and the coefficients, and its true value are obtained by the least squares method. Focusing on the learning data stored at one memory address, the following simultaneous equations hold for this address.
[0030]
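$$
\begin{aligned}
y_{10} &= w_1 a_{10} + w_2 b_{10} + w_3 c_{10} + \cdots + w_{12} l_{10} \\
y_{20} &= w_1 a_{20} + w_2 b_{20} + w_3 c_{20} + \cdots + w_{12} l_{20} \\
y_{30} &= w_1 a_{30} + w_2 b_{30} + w_3 c_{30} + \cdots + w_{12} l_{30} \\
&\;\;\vdots \\
y_{n0} &= w_1 a_{n0} + w_2 b_{n0} + w_3 c_{n0} + \cdots + w_{12} l_{n0}
\end{aligned}
$$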
[0031]
Here, $y_{10}$ to $y_{n0}$, $a_{10}$ to $a_{n0}$, $b_{10}$ to $b_{n0}$, $c_{10}$ to $c_{n0}$, ..., $l_{10}$ to $l_{n0}$ are known, so the coefficients $w_1$ to $w_{12}$ that minimize the squared error of the predicted values with respect to $y_{10}$ to $y_{n0}$ (the true values) can be obtained. Coefficients can be determined in the same way for the other classes (addresses). The coefficients thus determined are stored in a memory and used as a mapping table.
[0032]
Not only the coefficient but also the HD video signal data value for each class may be obtained by training and stored in the memory. For example, FIG. 6 shows a configuration for that purpose. A data memory 40 and a frequency memory 41 to which class information from the class classification circuit 33 is supplied as an address are provided.
[0033]
The read output of the frequency memory 41 is supplied to the adder 42 and incremented by 1, and the output of the adder 42 is written to the same address of the memory 41. In the memories 40 and 41, the contents of each address are cleared to zero as an initial state.
[0034]
Data read from the data memory 40 is supplied to the multiplier 43 and multiplied by the frequency read from the frequency memory 41. The output of the multiplier 43 is supplied to the adder 44, where it is added to the input data y. The output of the adder 44 is supplied to the divider 45 as the dividend, and the output of the adder 42 serves as the divisor. The output (quotient) of the divider 45 is written into the data memory 40.
[0035]
In the configuration of FIG. 6 described above, when a certain address is accessed for the first time, the read outputs of the memories 40 and 41 are 0, so the data $y_{10}$ is written into the memory 40 as it is, and the value at the corresponding address of the memory 41 is set to 1. If this address is accessed again later, the output of the adder 42 is 2 and the output of the adder 44 is ($y_{10} + y_{20}$). Therefore, the output of the divider 45 is ($y_{10} + y_{20}$)/2, which is written into the memory 40. When the address is accessed once more, the data in the memory 40 is updated to ($y_{10} + y_{20} + y_{30}$)/3 and the frequency is updated to 3.
[0036]
By performing the above-described operation for a predetermined period, a mapping table is stored in the memory 40 such that, when a class is designated by the output of the class classification circuit 33, the data corresponding to that class is output. In other words, a mapping table can be formed that, when a plurality of pixel data of the input video signal is given, outputs the data that corresponds on average to the class of that data.
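The running-average update of FIG. 6 can be sketched in Python as follows. This is a minimal software analogue rather than the circuit itself: the class name `AveragingTable` and the use of dictionaries in place of the data memory 40 and the frequency memory 41 are assumptions made for illustration.

```python
from collections import defaultdict


class AveragingTable:
    """Software analogue of the data memory 40 and frequency memory 41
    (hypothetical class; both memories start cleared to zero)."""

    def __init__(self):
        self.data = defaultdict(float)   # plays the role of data memory 40
        self.freq = defaultdict(int)     # plays the role of frequency memory 41

    def update(self, class_code, y):
        n = self.freq[class_code]
        total = self.data[class_code] * n + y     # multiplier 43, then adder 44
        self.freq[class_code] = n + 1             # adder 42 (increment by 1)
        self.data[class_code] = total / (n + 1)   # divider 45


table = AveragingTable()
for y in (10, 20, 60):
    table.update(class_code=3, y=y)
print(table.data[3], table.freq[3])
```

After the three updates the entry for class 3 holds the mean of the three inputs and a frequency of 3, matching the sequence of accesses described above.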
[0037]
The class classification adaptive processing circuit 9 will be described in more detail. As described above, the class classification adaptive processing circuit 9 determines the linear combination coefficients in advance by training. During this training, the configuration shown in FIG. 7 is used. In FIG. 7, reference numeral 51 denotes an input terminal to which a large number of standard HD signal still images are input; these are supplied to the vertical thinning filter 52 and the learning unit 54. The vertical thinning filter 52 thins the HD image by half in the vertical direction. The horizontal thinning filter 53, connected to the vertical thinning filter 52, thins the image by half in the horizontal direction and supplies a still image with a pixel count equivalent to the SD signal to the learning unit 54. The memory 55 stores the class code created by the learning unit 54 and the learning result.
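The half thinning in the vertical and horizontal directions can be sketched in Python as follows. Note the assumptions: the function name `thin_by_half` is hypothetical, and the sketch uses plain decimation (keeping every second line and every second pixel) without any low-pass prefiltering, whereas the thinning filters 52 and 53 may well include filtering.

```python
def thin_by_half(hd):
    """Form an SD-sized image from a 2-D HD image (list of rows).
    Assumption: simple decimation with no prefilter."""
    vertical = hd[::2]                      # vertical thinning: every second line
    return [row[::2] for row in vertical]   # horizontal thinning: every second pixel


hd_image = [[r * 10 + c for c in range(4)] for r in range(4)]
sd_image = thin_by_half(hd_image)
print(sd_image)
```

Each SD image produced this way is paired with the HD image it came from, so that the learning unit sees both the input and the true output.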
[0038]
In this example, as shown in FIG. 8, the positional relationship between HD pixels and SD pixels is defined. As shown in FIG. 8, when the SD pixel (3 × 3) block is used, the SD pixels a to i and the HD pixels A, B, C, and D are a set of learning data. There are a plurality of sets of learning data for one frame, and a very large number of sets of learning data can be used by increasing the number of frames.
[0039]
Here, FIG. 9 is a flowchart showing the operation when the learning unit 54 determines the linear combination coefficients by software. Control of the learning unit starts at step 61. In the corresponding data blocking of step 62, the HD signal and the SD signal are supplied, and HD pixels and SD pixels having the arrangement shown in FIG. 8 are extracted. In the data-end check of step 63, control shifts to the prediction coefficient determination of step 66 if all input data, for example one frame of data, has been processed, and to the class determination of step 64 if not.
[0040]
In the class determination at step 64, the class is determined from the signal pattern of the SD signal. In this control, ADRC can be used to reduce the number of bits. In the normal equation addition in step 65, an equation as described later is created.
[0041]
When the data-end check of step 63 finds that all data has been processed, control moves to step 66. In the prediction coefficient determination of step 66, the equation described later is solved using a matrix solution method to determine the prediction coefficients. In the prediction coefficient storing of step 67, the prediction coefficients are stored in the memory, and in step 68 the control of the learning unit ends. In the memory, the class determined from the SD signal is used as the address, and the prediction coefficients of that class are stored. The classes and prediction coefficients correspond to the mapping table described above.
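The learning flow of FIG. 9 can be sketched in Python as follows. This is only an illustrative sketch: the function names `solve_sweep_out` and `learn_coefficients`, the `(class_code, sd_taps, hd_value)` sample format, and the use of partial pivoting are assumptions, and the sketch presumes that each class receives enough independent samples for its equations to be solvable.

```python
from collections import defaultdict


def solve_sweep_out(a, b):
    """Solve a * w = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]          # bring largest pivot up
        p = m[col][col]
        m[col] = [v / p for v in m[col]]             # normalize the pivot row
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return [m[i][n] for i in range(n)]


def learn_coefficients(samples, n_taps):
    """samples: iterable of (class_code, sd_taps, hd_value) (assumed format)."""
    xtx = defaultdict(lambda: [[0.0] * n_taps for _ in range(n_taps)])
    xty = defaultdict(lambda: [0.0] * n_taps)
    for cls, x, y in samples:            # step 65: normal equation addition
        for i in range(n_taps):
            xty[cls][i] += x[i] * y
            for k in range(n_taps):
                xtx[cls][i][k] += x[i] * x[k]
    # steps 66 and 67: solve per class, store with the class as the address
    return {cls: solve_sweep_out(xtx[cls], xty[cls]) for cls in xtx}


samples = [(0, (1, 0), 0.5), (0, (0, 1), 0.25), (0, (1, 1), 0.75), (0, (2, 1), 1.25)]
coeffs = learn_coefficients(samples, n_taps=2)
print(coeffs[0])
```

Accumulating the sums sample by sample means that the learning data itself never has to be held in memory; only one small matrix and one vector per class are kept.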
[0042]
A process for obtaining the coefficients that define the relationship between the HD pixels and the SD pixels in FIG. 8 will be described in more detail. In general, let the SD pixel levels be $x_1$ to $x_n$ and the HD pixel level be $y$. For each class, an n-tap linear estimation formula with the coefficients $w_1$ to $w_n$,
$$y' = w_1 x_1 + w_2 x_2 + \cdots + w_n x_n \tag{1}$$
is set. Before learning, $w_i$ is an undetermined coefficient.
[0043]
As described above, learning is performed on a plurality of HD data and SD data for each class. When the number of data is m, according to Equation 1,
$$y_j' = w_1 x_{j1} + w_2 x_{j2} + \cdots + w_n x_{jn} \tag{2}$$
(where j = 1, 2, ..., m).
[0044]
If m > n, $w_1$ to $w_n$ are not uniquely determined, so the elements of the error vector e are defined as
$$e_j = y_j - (w_1 x_{j1} + w_2 x_{j2} + \cdots + w_n x_{jn}) \tag{3}$$
(where j = 1, 2, ..., m),
and the coefficients that minimize the following Equation 4 are obtained.
[0045]
[Expression 1]

[0046]
This is the so-called least squares method. Here, the partial derivative of Equation 3 with respect to $w_i$ is obtained.
[0047]
[Expression 2]

[0048]
Since each $w_i$ need only be determined so that Equation 6 becomes 0,
[0049]
[Equation 3]

[0050]
Expressed in matrix form:
[0051]
[Expression 4]

[0052]
If Equation 8 is solved using a general matrix solution method such as the sweep-out method (Gauss-Jordan elimination), the prediction coefficients $w_i$ are obtained. The prediction coefficients $w_i$ are then stored in the memory using the class code as the address.
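For reference, the standard least-squares derivation that follows from Equations 1 to 3 can be written out as below. This is a sketch under assumptions: the symbols $E$, $X_{ik}$ and $Y_i$ are notation chosen here and may differ from the original expressions, and no attempt is made to reproduce the original equation numbering.

```latex
% Squared error summed over the m learning samples of one class
E = \sum_{j=1}^{m} e_j^{2}, \qquad
e_j = y_j - \sum_{k=1}^{n} w_k x_{jk}

% Partial derivative with respect to each coefficient w_i
\frac{\partial E}{\partial w_i}
  = \sum_{j=1}^{m} 2\, e_j \frac{\partial e_j}{\partial w_i}
  = -2 \sum_{j=1}^{m} x_{ji}\, e_j

% Setting the derivative to zero gives n linear equations in w_1 .. w_n
\sum_{k=1}^{n} X_{ik}\, w_k = Y_i, \qquad
X_{ik} = \sum_{j=1}^{m} x_{ji}\, x_{jk}, \quad
Y_i = \sum_{j=1}^{m} x_{ji}\, y_j \qquad (i = 1, \dots, n)
```

The last line is a system of n linear equations in n unknowns, which is why an elimination method is sufficient to obtain the coefficients.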
[0053]
As described above, the learning unit can obtain the prediction coefficients $w_i$ using HD signals that are actual data and store them in the memory. Then, class information is formed from an arbitrary input SD signal, the prediction coefficients corresponding to the class information are read from the memory, and the value of the target pixel is formed by a linear combination of the values of the SD pixels around the target pixel and the prediction coefficients. In this way, an output HD image can be generated for any input SD image.
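The conversion step described above can be sketched in Python as follows. The function name `predict_hd_pixel`, the dictionary used as the coefficient memory, and the coefficient and pixel values are all hypothetical and serve only to show the table lookup followed by the linear combination.

```python
def predict_hd_pixel(coeff_table, class_code, sd_taps):
    """Read the class's prediction coefficients and combine them linearly
    with the surrounding SD pixel values (hypothetical helper)."""
    w = coeff_table[class_code]        # memory read, class code as the address
    return sum(wi * xi for wi, xi in zip(w, sd_taps))


# Hypothetical learned coefficients for two classes, four taps each.
coeff_table = {
    0: [0.25, 0.25, 0.25, 0.25],
    5: [0.5, 0.0, 0.5, 0.0],
}
value = predict_hd_pixel(coeff_table, 5, [100, 40, 120, 60])
print(value)
```

Because the coefficients are selected per class, the same SD tap values can yield different HD values depending on the local pattern that determined the class.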
[0054]
When the learning unit 54 determines not a prediction coefficient but a representative value for each class, a process as shown in the flowchart of FIG. 10 is performed. The start step 71, the learning data formation step 72, the data end step 73, and the class determination step 74 are the same as the above-described steps 61, 62, 63 and 64 in FIG.
[0055]
In the normalization step 75, the pixel values are normalized. That is, if the value (input value) of the HD pixel is y, the input data is normalized by the calculation of (y-base) / DR. Here, DR is a difference (dynamic range DR) between the maximum value and the minimum value of pixels in one block when a to i are one block in the pixel array shown in FIG. Further, base is a reference value of the block, for example, a minimum value of the pixel of the block. In addition to the minimum value, an average value of pixel values in the block may be used. This normalization allows attention to the relative level of the pixels.
[0056]
In step 76 for determining the representative value, the cumulative frequency n(c) of the class is obtained as in the case of FIG. 6, and the representative value g(c) is obtained. That is, the newly formed representative value g(c)′ is
$$g(c)' = \frac{(y - \mathrm{base})/DR + n(c)\, g(c)}{n(c) + 1} \tag{9}$$
The representative value for each class obtained in this way is stored in the memory.
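The normalization and the update of the representative value can be sketched together in Python as follows. The function name `update_representative` is hypothetical, the block minimum is used as the reference value, and skipping blocks whose dynamic range is 0 is an assumption made here to avoid division by zero; the text does not say how such blocks are handled.

```python
def update_representative(g, n, y, block):
    """One learning step for a class. Returns the updated (g, n), where g is
    the representative value and n the cumulative frequency of the class."""
    base = min(block)              # reference value: minimum of the block
    dr = max(block) - base         # dynamic range DR of the block
    if dr == 0:
        return g, n                # assumption: flat blocks are skipped
    normalized = (y - base) / dr   # normalization step 75
    return (normalized + n * g) / (n + 1), n + 1   # running mean, step 76


block = [100, 120, 140, 160, 200, 180, 150, 130, 110]   # SD pixels a to i
g, n = update_representative(0.0, 0, 150, block)
g, n = update_representative(g, n, 200, block)
print(g, n)
```

Because the stored value is relative to the block's own range, one representative value can serve blocks that share a pattern but differ in brightness or contrast.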
[0057]
As the information compression means for class classification, a DCT (discrete cosine transform), VQ (vector quantization), or DPCM (predictive coding) circuit, for example, may be provided instead of the ADRC circuit. Any means capable of performing compression can be selected as appropriate.
[0058]
As described above, the class classification adaptive processing circuit 9 learns the correspondence between the SD signal and the HD signal in the time domain based on the properties of actual images, and can generate an HD signal corresponding to the SD signal from that learning. In addition, since the class is adaptively selected according to the level distribution of the SD signal, up-conversion that follows the local nature of the image is possible. Further, unlike the case where an interpolation filter is used, an HD signal whose resolution is compensated can be obtained.
[0059]
Returning to FIG. 1, the gain conversion circuit 10 as the first processing circuit to which the impulse-like component 6b is supplied in the frequency domain from the classification circuit 5 compensates the resolution in the frequency domain. That is, as shown in FIG. 11, the gain conversion compensates for a decrease in high-frequency gain of a signal whose frequency characteristics have been expanded to high frequencies due to signal processing. Similar to the class classification adaptive processing circuit 9, the gain conversion circuit 10 has a memory in which a mapping table for compensating for high frequencies by learning is stored in advance. There are two types of mapping tables, one that outputs a gain conversion ratio and one that outputs a predicted value of gain, as in the time domain class classification adaptive processing circuit 9 described above.
[0060]
FIG. 12 shows a configuration at the time of learning for creating a mapping table in the gain conversion circuit 10. HD video data used for learning is supplied to an input terminal indicated by 81 and supplied to the subline / subsample circuit 82. This circuit 82 performs vertical thinning (subline) and horizontal thinning (subsample). Accordingly, the subline / subsample circuit 82 generates a video signal having a resolution comparable to that of the SD video signal.
[0061]
Delay circuit 83 and D / A converter 90 are connected to subline / subsample circuit 82. The delay circuit 83 is for delaying input data and matching timing until classification is performed. A blocking circuit 84 is connected to the delay circuit 83 and, for example, data having a block structure of (4 × 4) is synchronized. The output of the blocking circuit 84 is supplied to the DCT circuit 85, and cosine transform is performed. Starting from the DC component coefficient data, the DCT circuit 85 generates coefficient data in the order of low-order to high-order coefficient data (zigzag scanning).
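The cosine transform of a (4 × 4) block and the zigzag ordering of its coefficients can be sketched in Python as follows. The orthonormal scaling of the DCT and the JPEG-style zigzag path are assumptions made for this example; the text specifies only that the coefficients are generated from the DC component toward the higher orders.

```python
import math

N = 4  # block size (4 x 4)


def dct_2d(block):
    """Orthonormal 2-D DCT-II of an N x N block (scaling is an assumption)."""
    def c(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for y in range(N):
                for x in range(N):
                    s += (block[y][x]
                          * math.cos((2 * y + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * x + 1) * v * math.pi / (2 * N)))
            out[u][v] = c(u) * c(v) * s
    return out


def zigzag(coeffs):
    """Read coefficients from DC toward high orders along anti-diagonals."""
    order = sorted(
        ((u, v) for u in range(N) for v in range(N)),
        key=lambda p: (p[0] + p[1], p[0] if (p[0] + p[1]) % 2 else p[1]),
    )
    return [coeffs[u][v] for u, v in order]


flat_block = [[10] * N for _ in range(N)]
scanned = zigzag(dct_2d(flat_block))
print(scanned[0])
```

For a flat block only the first (DC) entry of the scanned list is non-zero, apart from floating-point rounding, which is the frequency-domain picture of a signal with no high-frequency content.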
[0062]
The coefficient data from the DCT circuit 85 is supplied to the division circuit 86. This division circuit 86 is provided for obtaining the gain conversion ratio for the coefficient data, which is required for compensating the high frequency band. The gain conversion ratio signal from the division circuit 86 is supplied to the memory 87. The memory 87 is made up of a plurality of memories in order to store the gain conversion ratios corresponding to the plurality of DCT coefficients.
[0063]
In order to investigate high-frequency degradation of the SD video signal resulting from the signal processing, the SD video signal converted into an analog signal by the D / A converter 90 is supplied to the analog transmission system 91. The analog transmission system 91 is, for example, an analog VTR recording and reproduction process. The video signal passed through the analog transmission system 91 is converted into a digital signal by the A / D converter 92 and supplied to the blocking circuit 93.
[0064]
The blocking circuit 93 forms digital video data having the same block structure as the output data of the blocking circuit 84. The output data of the blocking circuit 93 is supplied to the DCT circuit 94 and the class classification circuit 95. The coefficient data from the DCT circuit 94 is supplied to the division circuit 86. Division is performed between coefficient data of the same order, and the division circuit 86 generates a gain conversion ratio signal for the coefficient data. That is, the high frequency components are lost in passing through the analog transmission system 91, and the gain conversion ratio signal indicates how the gain (value) of each component of the DCT coefficient data changes as a result.
[0065]
For example, consider a case where coefficient data DC and AC1 to AC15 are generated from the DCT circuit 85, and coefficient data DC′ and AC1′ to AC15′ are generated from the DCT circuit 94. The division circuit 86 forms the gain conversion ratio signals $G_0, G_1, \ldots, G_{15}$ by the following calculation.
$$G_0 = DC / DC', \quad G_1 = AC_1 / AC_1', \quad \ldots, \quad G_{15} = AC_{15} / AC_{15}'$$
[0066]
Although omitted from FIG. 12 for simplicity, the final gain conversion ratio signal is obtained by averaging the plurality of gain conversion ratio signals generated for each coefficient, and is stored in the memory 87.
[0067]
Multiplying the coefficient data of video data whose high frequencies are attenuated by such a gain conversion ratio signal makes it possible to generate coefficient data of video data whose high frequencies are compensated. The gain conversion circuit 10 in FIG. 1 has a memory in which the gain conversion ratio signal obtained by learning in advance is stored, and changes the value of the coefficient data by multiplying the coefficient data by the gain conversion ratio signal. As a result, high-frequency compensation can be performed.
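Learning the gain conversion ratios and applying them can be sketched in Python as follows. All names here (`new_table`, `accumulate`, `averaged_gains`, `compensate`) are hypothetical. Two further assumptions are made because the text does not cover these cases: ratios whose divisor is close to zero are skipped, and a gain of 1.0 (no change) is used where no ratio was observed.

```python
from collections import defaultdict

N_COEFFS = 16  # DC and AC1 to AC15 of a (4 x 4) block


def new_table():
    # per class: running sums of the ratios and how many ratios were summed
    return defaultdict(lambda: ([0.0] * N_COEFFS, [0] * N_COEFFS))


def accumulate(table, class_code, original, degraded, eps=1e-6):
    """Add one block's ratios G_k = original[k] / degraded[k] to the class."""
    sums, counts = table[class_code]
    for k in range(N_COEFFS):
        if abs(degraded[k]) > eps:     # assumption: skip near-zero divisors
            sums[k] += original[k] / degraded[k]
            counts[k] += 1


def averaged_gains(table, class_code):
    sums, counts = table[class_code]
    # assumption: gain 1.0 where no ratio was observed for a coefficient
    return [s / n if n else 1.0 for s, n in zip(sums, counts)]


def compensate(coeffs, gains):
    """Multiply each DCT coefficient by the learned gain of the same order."""
    return [c * g for c, g in zip(coeffs, gains)]


pad = [0.0] * 14
table = new_table()
accumulate(table, 2, [100.0, 40.0] + pad, [100.0, 20.0] + pad)
accumulate(table, 2, [50.0, 30.0] + pad, [50.0, 10.0] + pad)
gains = averaged_gains(table, 2)
restored = compensate([80.0, 10.0] + pad, gains)
print(gains[:2], restored[:2])
```

In this toy example the DC gain stays at 1.0 while the first AC gain averages to 2.5, so only the attenuated component is boosted.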
[0068]
The class classification circuit 95 performs classification according to the level distribution of the block data from the blocking circuit 93. For this classification, it is preferable to perform data compression such as ADRC, as described above. The class information obtained by the class classification circuit 95 is supplied to the memory 87 as an in-memory address. The memory 87 is made up of a plurality of memories corresponding to the DC coefficient data and to the AC coefficient data of every order, and each of the plurality of memories stores the gain conversion ratio signal for its corresponding coefficient data.
[0069]
Corresponding to the coefficient data, an address for switching a plurality of memories is formed by an address counter 88. The address counter 88 counts the clock signal from the input terminal 89 and generates sequentially changing addresses. In this case, the address changes in synchronization with the coefficient data from the blocking circuit 84. Then, a plurality of types of HD video signals are supplied to the input terminal 81, and an optimum gain conversion ratio signal is formed for each class and stored in the memory 87.
[0070]
Further, instead of the gain conversion ratio, a predicted DCT coefficient value can be obtained by learning.
[0071]
The same gain conversion ratio signal stored in the memory 87 is stored in the memory provided in the gain conversion circuit 10 of FIG. Further, the output signal of the blocking circuit 2 is supplied to the gain conversion circuit 10 for classification. In the gain conversion circuit 10, each component of the DCT coefficient data is multiplied by the gain conversion ratio signal, and gain adjustment is performed. This compensates for the high frequency region. Here, an impulse component 6b is supplied to the gain conversion circuit 10 in the frequency domain. The reason is that if a signal composed of various components including a flat component is to be converted, a nonlinear component is mixed therein, resulting in a problem that accuracy is deteriorated and correct gain conversion cannot be performed. For the same reason, an impulse signal is also used during the learning shown in FIG.
[0072]
[Effects of the invention]
Unlike interpolation by a simple interpolation filter, the present invention can form an output video signal whose resolution is higher than that of the input video signal by creating high frequency components. The present invention separates the input video signal into a component suited to expression in the time domain and a component suited to expression in the frequency domain, processes each component in parallel, and synthesizes the processing results of the two domains. Therefore, advantages such as a reduction in processing time, a reduction in hardware scale, and an improvement in accuracy can be obtained as compared with performing the processing of each domain in two stages.
[Brief description of the drawings]
FIG. 1 is an overall block diagram of one embodiment of the present invention.
FIG. 2 is a schematic diagram for explaining resolution compensation performed according to an embodiment of the present invention;
FIG. 3 is a block diagram of an example of a class classification adaptive processing circuit according to an embodiment of the present invention.
FIG. 4 is a schematic diagram illustrating an arrangement of pixels between an SD image and an HD image.
FIG. 5 is a block diagram illustrating an example of a configuration for creating a mapping table storing prediction coefficients.
FIG. 6 is a block diagram illustrating an example of a configuration for creating a mapping table storing predicted values.
FIG. 7 is a block diagram of an example of a configuration at the time of learning for forming a prediction coefficient or a prediction value.
FIG. 8 is a schematic diagram illustrating another example of an arrangement of pixels between an SD image and an HD image.
FIG. 9 is a flowchart showing processing at the time of learning for forming a prediction coefficient.
FIG. 10 is a flowchart showing processing during learning for forming a predicted value.
FIG. 11 is a schematic diagram for explaining high-frequency compensation in a frequency domain.
FIG. 12 is a block diagram for learning a gain conversion ratio for high frequency compensation in a frequency domain.
FIG. 13 is a schematic diagram showing an impulse component and a flat component in the time domain, respectively.
FIG. 14 is a schematic diagram showing an impulse component and a flat component in the frequency domain, respectively.
FIG. 15 is a schematic diagram of a signal waveform including both an impulse-like component and a flat component in the time domain.
[Explanation of symbols]
1 High resolution digital image signal input terminal
3 DCT circuit
5 Classification circuit that separates flat and impulse components in the frequency domain
7, 11 Inverse DCT circuit
9 class classification adaptive processing circuit
10 Gain conversion circuit

Claims

Analyzing means for analyzing the first digital image signal in the frequency domain;
Classification means for classifying the first digital image signal into an impulse component signal and a flat component signal in the frequency domain based on the output of the analysis means;
First processing means which is supplied with the impulse component signal and processes the impulse component signal in the frequency domain to form an output digital image signal having a second resolution higher than the first resolution;
Second processing means which is supplied with the flat component signal and processes the flat component signal in the time domain to form an output digital image signal having the second resolution higher than the first resolution; and
Synthesizing means for synthesizing the output digital image signals of the first and second processing means and outputting a second digital image signal having the second resolution higher than the first resolution. An image conversion apparatus capable of resolution compensation, characterized by comprising the above means.

The image conversion apparatus according to claim 1 ,
The first processing means includes
An image conversion apparatus characterized by having conversion means for converting the impulse component signal processed in the frequency domain into a time domain signal.

The image conversion apparatus according to claim 2,
The second processing means includes
An image conversion apparatus characterized by having second conversion means for converting the flat component signal into a time domain signal.

The image conversion apparatus according to claim 1,
An image conversion apparatus, wherein the analyzing means in the frequency domain performs an orthogonal transform.

The image conversion apparatus according to claim 4 , wherein
An image conversion apparatus, wherein the orthogonal transform is a discrete cosine transform.

The image conversion apparatus according to claim 4 , wherein
An image conversion apparatus, wherein the orthogonal transform is a fast Fourier transform.

The image conversion apparatus according to claim 4 , wherein
An image conversion apparatus, wherein the orthogonal transform is a Hadamard transform.

The image conversion apparatus according to claim 1 ,
The first processing means comprises: class classification means which is supplied with the impulse component signal and determines a class based on the first digital image signal;
Correction value generating means for generating, for each determined class, a correction value for correcting the impulse component signal so as to obtain the second resolution higher than the first resolution; and
Correction means for correcting the impulse component signal by the generated correction value. An image conversion apparatus capable of resolution compensation, characterized by the above.

The image conversion apparatus according to claim 8 , wherein
The correction value generating means has a memory for storing a correction value for each class,
The image conversion apparatus is characterized in that the correction value for each class is obtained in advance by learning using a digital image signal having the second resolution and a digital image signal having the first resolution, lower than the second resolution, obtained by processing the digital image signal having the second resolution.

The image conversion apparatus according to claim 9 , wherein
An image conversion apparatus using an impulse component signal in a frequency domain as data used for learning.

The image conversion apparatus according to claim 9 , wherein
The image conversion apparatus is characterized in that, at the time of learning for obtaining the correction value for each class, a digital image signal in which the high frequency component is attenuated is formed by passing the digital image signal having the second resolution through an analog processing system, a ratio is obtained between a component obtained by converting the second digital image signal into the frequency domain and a component obtained by converting the digital image signal in which the high frequency component is attenuated into the frequency domain, and the correction value for each class is obtained from the ratio.

The image conversion apparatus according to claim 1 ,
The first processing means comprises: class classification means which is supplied with the impulse component signal and determines a class based on the impulse component signal; and
Correction value generation means for generating, for each determined class, a value indicating an impulse component signal having the second resolution higher than the first resolution. An image conversion apparatus capable of resolution compensation, characterized by the above.

In the image conversion apparatus according to claim 12,
The correction value generating means has a memory for storing a value indicating an impulse component signal having a second resolution for each class,
The image conversion apparatus is characterized in that the value indicating the impulse component signal having the second resolution for each class is obtained in advance by learning using a digital image signal having the second resolution and a digital image signal having the first resolution, lower than the second resolution, obtained by processing the digital image signal having the second resolution.

In the image conversion apparatus according to claim 13,
An image conversion apparatus using an impulse component signal in a frequency domain as data used for learning.

In the image conversion apparatus according to claim 13,
The image conversion apparatus is characterized in that, at the time of learning for obtaining the correction value for each class, a digital image signal in which the high frequency component is attenuated is formed by passing the digital image signal having the second resolution through an analog processing system, a ratio is obtained between a component obtained by converting the second digital image signal into the frequency domain and a component obtained by converting the digital image signal in which the high frequency component is attenuated into the frequency domain, and the correction value for each class is obtained from the ratio.

The image conversion apparatus according to claim 3,
The second processing means comprises: class classification means for classifying the signal at the target pixel position, using signals at a plurality of pixel positions that are spatially and/or temporally near the target pixel position in the time domain signal output from the conversion means; prediction coefficient generating means for generating a prediction coefficient for each determined class; and estimation means for estimating a predicted value by linear combination of the values of a plurality of pixels of the time domain signal from the conversion means and the prediction coefficients. An image conversion apparatus characterized by the above.

The image conversion apparatus according to claim 16 , wherein
The prediction coefficient generating means has a memory for storing a prediction coefficient for each class,
The image conversion apparatus is characterized in that, when the value of the pixel of interest is created by linear combination of the prediction coefficients and the values of a plurality of pixels that are included in the time domain signal from the second conversion means and are spatially and/or temporally adjacent to the pixel of interest, the prediction coefficient for each class that minimizes the error between the created value and the true value of the pixel of interest is obtained in advance by learning.

The image conversion apparatus according to claim 3,
The second processing means includes a class classification means for determining a class of the target pixel based on signals at a plurality of pixel positions that are spatially and / or temporally adjacent to the signal at the target pixel position;
An image conversion apparatus comprising: predicted value generation means for generating a predicted value for each determined class in order to generate the value of the target pixel.

The image conversion apparatus according to claim 18 , wherein
The predicted value generating means has a memory for storing a predicted value for each class,
An image conversion apparatus characterized in that a value obtained by dividing a cumulative value obtained for each class during learning by a cumulative frequency is stored in the memory as a predicted value for each class.

The image conversion apparatus according to claim 19 , wherein
The predicted value generating means has a memory for storing a predicted value for each class,
During learning, a block consisting of a plurality of pixels including the target pixel is formed,
The value obtained by subtracting the reference value of the block from the value of the pixel of interest is normalized by the dynamic range in the block,
An image conversion apparatus characterized in that the value obtained by dividing the cumulative value of the normalized values by the cumulative frequency is stored in the memory as the predicted value for each class.

Frequency analyzing a first digital image signal having a first resolution;
Separating the impulse component signal and the flat component signal in the frequency domain based on the analysis results;
A first processing step in which the impulse component signal is supplied and the impulse component signal is processed in the frequency domain to form an output digital image signal having a second resolution higher than the first resolution;
A second processing step in which the flat component signal is supplied and the flat component signal is processed in the time domain to form an output digital image signal having the second resolution higher than the first resolution; and
A synthesizing step of synthesizing the output digital image signal of the first processing step and the output digital image signal of the second processing step. An image conversion method capable of resolution compensation, characterized by comprising the above steps.